Mechanism Design for LLM Fine-tuning with
Multiple Reward Models

Haoran Sun
Peking University
sunhaoran0301@stu.pku.edu.cn
Yurong Chen
Peking University
chenyurong@pku.edu.cn
Siwei Wang
Microsoft Research Asia
siweiwang@microsoft.com
Wei Chen
Microsoft Research Asia
weic@microsoft.com
Xiaotie Deng
Peking University
xiaotie@pku.edu.cn
Abstract

Recent research on fine-tuning large language models (LLMs) by aggregating multiple preferences has attracted considerable attention. However, the existing literature predominantly focuses on the empirical performance of aggregation algorithms, neglecting agents' underlying incentives to misreport their preferences. In this paper, we formalize this as a multi-parameter mechanism design problem, in which an LLM provider designs both training and payment rules to achieve specific objectives and to promote truthful reporting of preferences. First, we establish the necessity of a payment scheme by demonstrating that without payments, truth-telling is a strongly dominated strategy under a wide range of training rules. We then introduce the affine maximizer payment scheme for the social-welfare-maximizing training rules widely used in practice, which ensures both dominant-strategy incentive compatibility (DSIC) and individual rationality (IR). Furthermore, we prove that under mild conditions, any other payment rule that also implements these training rules in DSIC can be converted to the affine maximizer payment by adding a factor independent of the agents' own reports. Finally, we show that this mechanism satisfies approximate DSIC when the input of the mechanism is a biased version of the reported preferences, showcasing its robustness in real-world applications.

1 Introduction

The pre-training and fine-tuning paradigm is fundamental in the development of language models (Devlin et al., 2018; Radford et al., 2018; Liu et al., 2019; Touvron et al., 2023). During pre-training, the model is fed vast amounts of data to acquire a general capability to understand and generate language through self-supervised learning. The subsequent fine-tuning phase customizes these pre-trained models for specific downstream tasks using smaller, task-oriented datasets, ensuring that the model outputs are more closely aligned with particular requirements. As LLMs gain popularity, there is a growing demand for fine-tuning base LLMs, since base models often fail to meet users' demands, especially in catering to individual preferences.

The process of fine-tuning an LLM to align with certain human preferences is challenging to achieve through supervision (Ji et al., 2023; Köpf et al., 2024; Wang et al., 2023b; Shen et al., 2023), primarily due to the difficulty of constructing datasets with a substantial number of valid question-answer pairs for supervised training. Reinforcement learning from human feedback (RLHF) (Ouyang et al., 2022; Christiano et al., 2017) offers a promising solution to this problem. In RLHF, a reward model is first trained to serve as a proxy for human judgment. This model then provides reward signals for the standard reinforcement learning process. This technique of fine-tuning with a reward model has proven effective in encoding human preferences into models and has become a fundamental component of the training process for most advanced LLMs. With the advancement of RLHF, numerous studies have investigated efficient methods for aggregating multiple preferences into a single fine-tuned model.

However, most of these studies focus primarily on improving empirical performance across various metrics (Ramé et al., 2024; Wu et al., 2024; Jang et al., 2023; Coste et al., 2023; Zhang et al., 2024; Wang et al., 2024; Eisenstein et al., 2023). They often implicitly assume access to the true preferences, neglecting the possibility that agents misreport them. This problem becomes more crucial in real-world scenarios where different agents provide their preferences for aggregation. In such cases, agents may misreport strategically to increase their utility. As an intuitive example, if an agent knows beforehand that the fine-tuning process aims to neutralize all preferences, pretending to have a more polarized preference can be a beneficial strategy. Such strategic behavior can distort the final training results, even if the training algorithm itself is highly effective. Nevertheless, this issue has not attracted sufficient attention in the existing literature, particularly concerning the fine-tuning of LLMs.

Our Contribution.

In this paper, we study incentive design in such scenarios. First, we formalize the setting as a multi-parameter mechanism design problem between a fine-tuning service provider and groups of agents seeking fine-tuning services. The provider proposes a mechanism that includes a training rule for integrating different groups' preferences into a fine-tuned model and a payment rule for charging the groups. After observing the mechanism, each group strategically reports its preference to maximize its utility. We assume the subsequent fine-tuning process is implemented via RLHF, the standard method for aligning a model with human preferences. Accordingly, we abstract each group's preference as a reward model, and call the whole scenario the RLHF Game.

Second, we demonstrate the profitability of misreporting a polarized preference under a wide range of mechanisms that include only a training rule (Theorem 3.3). This underscores the necessity of a payment rule to address incentive issues.

Third, we focus on a representative set of training rules, termed SW-Maximizing training rules, in which the provider aims to maximize social welfare while incorporating different regularization measures. For SW-Maximizing training rules, we propose the affine maximizer payment scheme, a weighted version of the Vickrey-Clarke-Groves (VCG) payment (Vickrey, 1961; Clarke, 1971; Groves, 1973). We prove that truthful reporting constitutes a dominant strategy for all agents under such mechanisms (Theorem 4.2). Utilizing the notion of payment equivalence, we prove that under a mild condition, any other payment rule that also implements these training rules in dominant-strategy incentive compatibility (DSIC) can be converted to the affine maximizer payment by adding a factor independent of the agents' own reports (Theorem 4.5). We verify this condition for many commonly used regularization terms, such as KL-divergence (Proposition 4.4). Consequently, we derive the revenue-maximizing payment rule that implements SW-Maximizing training rules in both DSIC and individual rationality (IR) (Corollary 4.6). Furthermore, we show that the mechanism remains approximately DSIC when its input is a biased version of the reported preferences, an abstraction of the errors that inevitably occur in practice. This showcases the robustness of the proposed mechanisms in real-world applications (Theorem 4.9).

Primary Related Work.

Several studies have investigated similar scenarios; among them, Duetting et al. (2023) and Soumalias et al. (2024) are the most closely related to ours. Duetting et al. (2023) examine the problem of designing a mechanism that aggregates multiple agents' preferences based on their bids and determines their payments. However, they exclude the case where preferences can be misreported, which is the primary concern of our study. The concurrent work of Soumalias et al. (2024) also considers mechanism design for aggregating multiple preferences. Their focus is mainly on the practical implementation of the SW-Maximizing training rule with KL-divergence and on a payment scheme that achieves both DSIC and interpretability. In contrast, we are more concerned with the theoretical properties of more general mechanisms, including implementability and payment equivalence.

Additionally, several works study other LLM-related scenarios from the perspective of algorithmic game theory. Laufer et al. (2023) abstract the fine-tuning process as a bargaining game and characterize its subgame-perfect equilibria. Dubey et al. (2024) propose an auction in which bidders compete to place their content within a summary generated by an LLM. Conitzer et al. (2024) consider incorporating social choice theory into LLM alignment. Feizi et al. (2023) explore the potential of leveraging LLMs in online advertising systems.

Paper Organization.

In Section 2, we provide the preliminaries and the formal description of the RLHF Game. In Section 3, we study incentive design for general training rules in the RLHF Game. We demonstrate the properties of mechanisms consisting of SW-Maximizing training rules and payment rules in Section 4. Further related work is discussed in Section 5, and we conclude in Section 6.

2 Preliminaries and Model

2.1 Preliminaries

Large Language Models.

Large language models (LLMs) function as mappings from a sequence of tokens to a probability distribution over the next token. The input sequence is usually constrained by a maximum length $K$, making the set of all possible inputs finite. Let $T$ denote the finite set of all tokens, and let $T^{\ast}\coloneqq\emptyset\cup T\cup T^{2}\cup\cdots\cup T^{K}$ represent the set of all possible input sequences with length up to $K$.

An LLM parameterized by $\theta\in\Theta$ is denoted by $g_{\theta}:T^{\ast}\to\Delta T$, where $\Delta T$ is the set of all probability distributions over the token set $T$. For practical purposes, the output sequence is also required to be of finite length. We assume the maximum output length is also $K$, so that the output space is also $T^{\ast}$. We write $\text{LLM}_{\theta}({\bm{x}})$ for the probability that a sequence of tokens ${\bm{x}}\in T^{\ast}$ is generated by $g_{\theta}$. Since the model generates a sequence by predicting the next token iteratively until a special ending token is encountered, the relationship between $\text{LLM}_{\theta}$ and $g_{\theta}$ is given by:

$$\text{LLM}_{\theta}({\bm{x}})=\prod_{t=1}^{|{\bm{x}}|}g_{\theta}(x_{t}\mid{\bm{x}}_{<t}),$$

where ${\bm{x}}_{<t}$ denotes the prefix of ${\bm{x}}$ preceding $x_{t}$, and ${\bm{x}}_{<1}=\emptyset$. $\text{LLM}_{\theta}$ is a distribution over $T^{\ast}$ and can be represented as a $|T^{\ast}|$-dimensional vector, with each coordinate ${\bm{x}}$ giving the probability of ${\bm{x}}$ being generated under $g_{\theta}$.
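To make the factorization concrete, the sketch below computes $\text{LLM}_{\theta}({\bm{x}})$ from a next-token model on a three-token vocabulary. The tiny distribution `g` is a hypothetical stand-in for $g_{\theta}$, not part of the paper's model.

```python
# Hypothetical stand-in for g_theta: maps a prefix (tuple of tokens) to a
# probability distribution over the next token, for T = {"a", "b", "<eos>"}.
def g(prefix):
    if len(prefix) == 0:
        return {"a": 0.5, "b": 0.3, "<eos>": 0.2}
    return {"a": 0.2, "b": 0.3, "<eos>": 0.5}

def llm_prob(x):
    """LLM(x) = prod_{t=1}^{|x|} g(x_t | x_{<t}), as in the equation above."""
    prob = 1.0
    for t, token in enumerate(x):
        prob *= g(x[:t])[token]
    return prob
```

For instance, `llm_prob(("a", "b", "<eos>"))` multiplies $0.5\cdot 0.3\cdot 0.5=0.075$, iterating until the ending token as described in the text.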

Reward Modeling.

Reward modeling is instrumental in aligning LLMs with human preferences, particularly within the context of RLHF. In this process, a reward model $\text{rm}:T^{\ast}\to\mathbb{R}$ is first trained on a human-annotated preference dataset using the Bradley-Terry model (Bradley and Terry, 1952). Essentially, the reward model is a function that maps a sequence of tokens to a real number indicating the preference for that sequence. Similar to $\text{LLM}_{\theta}$, rm can also be viewed as a $|T^{\ast}|$-dimensional vector. Following prior empirical work on RLHF (Rafailov et al., 2023), we consider normalized reward models whose rewards sum to $1$ over $T^{\ast}$, i.e., $\sum_{{\bm{x}}\in T^{\ast}}\text{rm}({\bm{x}})=1$. Furthermore, we assume the output rewards are all non-negative, i.e., $\text{rm}({\bm{x}})\geq 0$ for all ${\bm{x}}\in T^{\ast}$. Unless otherwise stated, we use $\mathcal{R}$ to denote the domain of all reward model functions satisfying these conditions. In fact, our results also apply to other normalization methods such as $\max_{{\bm{x}}\in T^{\ast}}\text{rm}({\bm{x}})=1$.
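The normalization convention is straightforward to apply in code: shift raw scores to be non-negative, then scale them to sum to one. A minimal sketch, with invented raw scores on a toy outcome space:

```python
def normalize_rm(raw):
    """Map raw reward scores to a normalized reward model:
    non-negative and summing to 1 over the outcome space.
    Assumes not all raw scores are equal (otherwise the total is zero)."""
    lo = min(raw.values())
    shifted = {x: r - lo for x, r in raw.items()}   # now rm(x) >= 0
    total = sum(shifted.values())
    return {x: r / total for x, r in shifted.items()}

raw = {"x1": -1.0, "x2": 2.0, "x3": 0.5}  # hypothetical raw scores
rm = normalize_rm(raw)
# rm is non-negative and sums to 1, matching the convention in the text.
```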

2.2 Formulation of the RLHF Game

In this part, we present the formal description of the RLHF Game. There is one LLM provider and $n$ groups of agents, denoted by $[n]=\{1,2,\cdots,n\}$. The provider has an initial model $\text{LLM}_{\theta_{\text{init}}}$ that assigns positive probability to every sequence, i.e., $\text{LLM}_{\theta_{\text{init}}}({\bm{x}})>0$ for all ${\bm{x}}\in T^{\ast}$. Each group $i$ has $w_{i}$ agents and a joint preference represented by a reward model $\text{rm}_{i}$. Let $\mathcal{R}$ and $\mathcal{W}\subseteq\mathbb{N}_{+}$ denote the domains of each group's reward model and group size, respectively. We assume an upper bound $\bar{w}$ on $\mathcal{W}$. The exact reward model $\text{rm}_{i}$ and the size $w_{i}$ are group $i$'s private information. For an agent in group $i$, the valuation of receiving a model $\text{LLM}_{\theta}$ is denoted by $v_{i}(\theta;\text{rm}_{i})$. The form of the valuation function $v_{i}(\cdot;\cdot)$ is known to both the provider and the agents.

The provider first announces the mechanism, consisting of a training rule $\psi$ and a payment rule $p$:

$$\psi:\mathcal{R}^{n}\times\mathcal{W}^{n}\times\Theta\to\Theta,\qquad p:\mathcal{R}^{n}\times\mathcal{W}^{n}\times\Theta\to\mathbb{R}^{n}.$$

Both rules take the $n$ reported reward models, the $n$ reported sizes, and an initial model as input, and output the fine-tuned model and each group's payment, respectively. The provider can choose not to charge the users by setting $p\equiv 0$; in this case, the model coincides with most previous work, where agents' incentives are not considered (Ramé et al., 2024; Wu et al., 2024; Jang et al., 2023; Coste et al., 2023; Zhang et al., 2024; Wang et al., 2024; Eisenstein et al., 2023). Specifically, the training rule seeks the model that maximizes a certain objective function $f$. That is,

$$\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})\in\arg\max_{\theta\in\Theta}f\bigl(v_{1}(\theta;\text{rm}_{1}),\cdots,v_{n}(\theta;\text{rm}_{n}),\vec{w},D(\text{LLM}_{\theta}\,||\,\text{LLM}_{\theta_{\text{init}}})\bigr),$$

where $D$ is a measure of the distance between $\text{LLM}_{\theta}$ and $\text{LLM}_{\theta_{\text{init}}}$. We assume that $f$ has a unique global optimum for any possible input; hence, in the rest of the paper, the "$\in$" in the definition of $\psi$ is replaced by "$=$".

After observing the announced mechanism $(\psi,p)$, each group $i$ reports a reward model $\widetilde{\text{rm}}_{i}$ and its group size $\tilde{w}_{i}$. We assume all reported sizes lie in $\mathcal{W}$ and are therefore bounded by $\bar{w}$. Based on the reported information, the provider fine-tunes the model until $\text{LLM}_{\theta_{\text{final}}}$ is optimal, i.e., the final parameter satisfies $\theta_{\text{final}}=\psi(\overrightarrow{\widetilde{\text{rm}}},\vec{\tilde{w}},\theta_{\text{init}})$. The provider then charges group $i$ according to the payment rule, $p_{i}(\overrightarrow{\widetilde{\text{rm}}},\vec{\tilde{w}},\theta_{\text{init}})$. All members of the group have access to the fine-tuned model $\theta_{\text{final}}$, so the valuation of group $i$ is $w_{i}v_{i}(\theta_{\text{final}};\text{rm}_{i})$. We assume all groups have quasi-linear utilities. Therefore, group $i$'s utility is

$$u_{i}(\overrightarrow{\widetilde{\text{rm}}},\vec{\tilde{w}};\psi,p,\text{rm}_{i},w_{i})=w_{i}v_{i}(\theta_{\text{final}};\text{rm}_{i})-p_{i}(\overrightarrow{\widetilde{\text{rm}}},\vec{\tilde{w}},\theta_{\text{init}}).$$

The groups may report strategically, so $\overrightarrow{\widetilde{\text{rm}}}$ and $\vec{\tilde{w}}$ do not necessarily equal the true $\overrightarrow{\text{rm}}$ and $\vec{w}$. The goal of the LLM provider is to achieve its training objective with respect to the groups' true preferences, taking into account that misreporting may distort the training outcome. To this end, it is crucial to incentivize all groups to report their information truthfully, so that the provider gains access to the groups' private information. We formally define these desiderata of a mechanism as follows.

Definition 2.1.

A mechanism $(\psi,p)$ satisfies dominant-strategy incentive compatibility (DSIC) if for all $i$, $\text{rm}_{i}$, $w_{i}$, $\text{rm}^{\prime}_{i}$, $w^{\prime}_{i}$, $\overrightarrow{\text{rm}}_{-i}$, $\vec{w}_{-i}$, and $\theta_{\text{init}}$, we have

$$u_{i}((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i});\psi,p,\text{rm}_{i},w_{i})\geq u_{i}((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),(w^{\prime}_{i},\vec{w}_{-i});\psi,p,\text{rm}_{i},w_{i}).\quad\text{(DSIC)}$$
Definition 2.2.

A mechanism $(\psi,p)$ satisfies individual rationality (IR) if for all $i$, $\text{rm}_{i}$, $w_{i}$, $\overrightarrow{\text{rm}}_{-i}$, $\vec{w}_{-i}$, and $\theta_{\text{init}}$, we have

$$u_{i}((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i});\psi,p,\text{rm}_{i},w_{i})\geq 0.\quad\text{(IR)}$$

DSIC means that for any group, truthfully reporting the reward model and the group size yields the highest utility, regardless of other groups' reports. IR means that truthful reporting always yields non-negative utility. Only when both DSIC and IR are satisfied are all groups incentivized to participate in the game and report truthfully. When a mechanism $(\psi,p)$ satisfies DSIC, IR, or both, we say that the payment rule $p$ implements $\psi$ in DSIC, IR, or both DSIC and IR. In particular, when we speak of the implementability of a training rule, we refer to the property of DSIC.

3 Incentives for General Training Rules

In this section, we discuss incentive design within the RLHF Game framework. As a warm-up, we consider a simplified scenario where all group sizes equal $1$, i.e., $\vec{w}=\vec{1}$, and this information is public to all groups and the provider. Consequently, each group only reports its reward model. For convenience, we let $\vec{w}\equiv 1$ and omit the notation $\vec{w}$. Unless stated otherwise, the results directly extend to the more general case where $\vec{w}$ is also private information.

For the valuation function in this section, we consider a natural form $v(\cdot;\cdot)$, defined as follows.

Assumption 3.1.

For any agent with preference represented by reward model rm, its valuation of model $\text{LLM}_{\theta}$ is its expected reward over the sequences generated by $\text{LLM}_{\theta}$:

$$v(\theta;\text{rm})=\mathbb{E}_{{\bm{x}}\sim\text{LLM}_{\theta}}[\text{rm}({\bm{x}})]=\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})\,\text{rm}({\bm{x}}).$$

In practice, this valuation can be obtained by averaging the reward of sequences sampled from the LLM. We discuss the influence of possible errors in this process in Section 4.
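Both routes can be sketched on a toy outcome space: the exact sum from Assumption 3.1 and the Monte-Carlo average just mentioned. All distributions below are invented for illustration.

```python
import random

llm = {"x1": 0.6, "x2": 0.3, "x3": 0.1}   # hypothetical LLM_theta over T*
rm  = {"x1": 0.1, "x2": 0.7, "x3": 0.2}   # hypothetical normalized reward model

def valuation_exact(llm, rm):
    """v(theta; rm) = sum_x LLM_theta(x) * rm(x)."""
    return sum(p * rm[x] for x, p in llm.items())

def valuation_sampled(llm, rm, n=100_000, seed=0):
    """Estimate v(theta; rm) by averaging rewards of sampled sequences."""
    rng = random.Random(seed)
    xs = rng.choices(list(llm), weights=list(llm.values()), k=n)
    return sum(rm[x] for x in xs) / n
```

For these numbers the exact valuation is $0.6\cdot 0.1+0.3\cdot 0.7+0.1\cdot 0.2=0.29$; the sampled estimate converges to it as $n$ grows, and its finite-sample error is exactly the kind of bias discussed in Section 4.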

3.1 Necessity of Payment Rule

We begin by demonstrating the necessity of payment rules to ensure incentive compatibility for training rules under the following assumptions.

Assumption 3.2.

(1) For all $i\in[n]$, $\partial f/\partial v_{i}$ exists and $\partial f/\partial v_{i}>0$; $\partial f/\partial D$ exists and $\partial f/\partial D<0$. (2) The distance measure $D$ satisfies that for all ${\bm{x}}\in T^{\ast}$, $\partial^{2}D/\partial\text{LLM}_{\theta}({\bm{x}})^{2}$ exists and is positive. (3) For all $\overrightarrow{\text{rm}}$ and $\theta_{\text{init}}$, the fine-tuned model $\theta=\psi(\overrightarrow{\text{rm}},\theta_{\text{init}})$ satisfies $\text{LLM}_{\theta}({\bm{x}})>0$ for all ${\bm{x}}\in T^{\ast}$.

The rationale behind these assumptions is as follows. (1) states that the training process aims to find a model $\text{LLM}_{\theta}$ that both yields higher valuations for all agents and stays close to the initial model $\text{LLM}_{\theta_{\text{init}}}$. (2) is a convexity-type condition: the penalty on $\text{LLM}_{\theta}({\bm{x}})$ grows increasingly fast as it moves away from $\text{LLM}_{\theta_{\text{init}}}({\bm{x}})$. (3) excludes extreme training rules whose outcome remains unchanged for most inputs and then changes drastically. In practice, (1) is satisfied by most training objectives $f$, including those maximizing social welfare and Nash social welfare. (2) and (3) depend on the choice of the regularization measure $D$ and the strength of regularization; at the least, they are satisfied by the commonly used KL-divergence.

Combining these three conditions, we show that when the aggregate preference $\sum_{i=1}^{n}\text{rm}_{i}({\bm{x}})$ for some ${\bm{x}}$ increases while the others remain unchanged, the probability of ${\bm{x}}$ under the optimal model also increases. An intuitive manipulation is therefore to report a polarized reward model: a higher reward $\widetilde{\text{rm}}({\bm{x}})$ on the ${\bm{x}}$ the agent values most. We show that this strategy yields strictly higher utility, unless the agent is indifferent among the outcomes in some subset $S\subseteq T^{\ast}$ and does not care about the outcomes outside $S$ at all.

Theorem 3.3.

Under Assumptions 3.1 and 3.2, when the payment rule is $p\equiv 0$, truthfully reporting $\text{rm}_{i}$ is a strongly dominated strategy for every agent $i$, except in the following case: $\exists S\subseteq T^{\ast}$ such that $\text{rm}_{i}({\bm{x}})=1/|S|$ if ${\bm{x}}\in S$ and $\text{rm}_{i}({\bm{x}})=0$ if ${\bm{x}}\notin S$.

Here, we call a strategy strongly dominated if some other strategy yields strictly higher utility regardless of the others' reports. Theorem 3.3 shows that, with a training rule alone, truthful reporting is strongly dominated and thus will not be adopted by rational agents.
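A toy numeric check illustrates the theorem. Take an SW-Maximizing rule with KL regularization (formalized later in Definition 4.1), whose optimizer over output distributions has the standard closed form $q({\bm{x}})\propto\text{LLM}_{\theta_{\text{init}}}({\bm{x}})\exp\bigl(\sum_{i}\text{rm}_{i}({\bm{x}})/\lambda\bigr)$. All numbers below are invented, and the code is a sketch, not the paper's construction:

```python
import math

X = ["x1", "x2", "x3"]
base = {x: 1/3 for x in X}       # uniform LLM_init
lam = 1.0                        # regularization strength

def tuned(reports):
    """q(x) ∝ base(x) * exp(sum_i rm_i(x) / lam): the KL-regularized
    welfare maximizer (closed form for this training rule)."""
    w = {x: base[x] * math.exp(sum(r[x] for r in reports) / lam) for x in X}
    z = sum(w.values())
    return {x: w[x] / z for x in X}

def value(q, rm):
    """Expected reward of the tuned model under an agent's true rm."""
    return sum(q[x] * rm[x] for x in X)

true_rm = {"x1": 0.5, "x2": 0.3, "x3": 0.2}   # agent 1's true preference
polar   = {"x1": 1.0, "x2": 0.0, "x3": 0.0}   # polarized misreport
other   = {"x1": 0.2, "x2": 0.3, "x3": 0.5}   # the other agent's report

honest = value(tuned([true_rm, other]), true_rm)
polarized = value(tuned([polar, other]), true_rm)
assert polarized > honest  # with p ≡ 0, the polarized report strictly wins
```

With these numbers the honest report yields an expected reward of about $0.334$, while the polarized report yields about $0.379$; closing this gap is precisely the job of the payment rule.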

3.2 Characteristics of Payment Rules

Having established the necessity of payment rules in this scenario, we address two main questions in the remainder of this section. First, given a training rule $\psi$, can we find a payment rule $p$ such that the mechanism $(\psi,p)$ satisfies DSIC? This is the so-called implementability of a training rule $\psi$. Second, for an implementable training rule $\psi$, can we characterize the relationship among the payment rules $p$ of all DSIC mechanisms $(\psi,p)$?

We resolve the first question primarily by utilizing the notion of cycle monotonicity, first proposed by Rochet (1987). Cycle monotonicity generalizes the monotonicity defined in single-parameter scenarios (Myerson, 1981). In the RLHF Game, we define the function
$$l(\text{rm}^{\prime},\text{rm};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\coloneqq v_{i}(\psi((\text{rm},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm})-v_{i}(\psi((\text{rm}^{\prime},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}),$$
which measures agent $i$'s valuation gain from switching from misreporting $\text{rm}^{\prime}$ to truthfully reporting rm, given $\overrightarrow{\text{rm}}_{-i}$ and $\theta_{\text{init}}$. Cycle monotonicity is defined in terms of this function:

Definition 3.4 (Cycle Monotonicity).

The training rule $\psi$ satisfies cycle monotonicity if for any $\text{rm}_{i},\text{rm}^{\prime}_{i}\in\mathcal{R}_{i}$, any finite and distinct sequence of reward models $[\text{rm}_{i},\text{rm}_{i}^{1},\text{rm}_{i}^{2},\cdots,\text{rm}_{i}^{k},\text{rm}^{\prime}_{i}]$ of agent $i$ ($k\geq 0$), and any $\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}}$, we have

$$\sum_{j=0}^{k+1}l(\text{rm}_{i}^{j},\text{rm}_{i}^{j+1};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\geq 0,\quad\text{where }\text{rm}_{i}^{0}=\text{rm}_{i}^{k+2}:=\text{rm}_{i}\text{ and }\text{rm}_{i}^{k+1}:=\text{rm}^{\prime}_{i}.$$

For general training rules, cycle monotonicity is a necessary and sufficient condition for implementability.

Theorem 3.5 (Rochet (1987)).

A training rule $\psi$ is implementable if and only if it satisfies cycle monotonicity.

In fact, the proof of Theorem 3.5 is constructive. However, for general implementable training rules, the calculation of the payment rules is too complex to be practical.
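While computing payments this way is impractical, checking cycle monotonicity itself is easy on a finite set of candidate reward models. The sketch below brute-forces Definition 3.4 for a single agent, with $\overrightarrow{\text{rm}}_{-i}$ and $\theta_{\text{init}}$ fixed and folded into `psi`. The example `psi`, a KL-regularized welfare maximizer in closed form, is one concrete implementable rule; all numbers are invented:

```python
import math
from itertools import permutations

X = ["x1", "x2"]

def psi(rm, lam=1.0):
    """Example training rule: tuned distribution q(x) ∝ exp(rm(x)/lam)
    over a uniform base (a KL-regularized welfare maximizer)."""
    w = {x: math.exp(rm[x] / lam) for x in X}
    z = sum(w.values())
    return {x: w[x] / z for x in X}

def value(q, rm):
    return sum(q[x] * rm[x] for x in X)

def l_term(a, b):
    """l(a, b) = v(psi(b); b) - v(psi(a); b): the gain from truthfully
    reporting b rather than misreporting a."""
    return value(psi(b), b) - value(psi(a), b)

def cycle_monotone(candidates, max_len=3):
    """Check that every cycle over the candidate set has non-negative l-sum."""
    for k in range(2, max_len + 1):
        for cyc in permutations(candidates, k):
            seq = list(cyc) + [cyc[0]]           # close the cycle
            total = sum(l_term(seq[j], seq[j + 1]) for j in range(k))
            if total < -1e-12:
                return False
    return True

grid = [{"x1": 1.0, "x2": 0.0},
        {"x1": 0.6, "x2": 0.4},
        {"x1": 0.2, "x2": 0.8}]
assert cycle_monotone(grid)  # consistent with Theorem 3.5: psi is implementable
```

Passing the check on a grid is of course only evidence, not a proof over the full continuous domain $\mathcal{R}$.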

The second question is more general, so we primarily consider the concept of payment equivalence (Ashlagi et al., 2010) for an implementable training rule.

Definition 3.6 (Payment Equivalence).

An implementable training rule $\psi$ satisfies payment equivalence if for any two mechanisms $(\psi,p)$ and $(\psi,p^{\prime})$ satisfying DSIC, there exists a function $f$ such that

$$p^{\prime}_{i}(\text{rm}_{i},\overrightarrow{\text{rm}}_{-i};\theta_{\text{init}})=p_{i}(\text{rm}_{i},\overrightarrow{\text{rm}}_{-i};\theta_{\text{init}})+f(\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\quad\forall\text{rm}_{i}\in\mathcal{R}_{i}.$$

Equivalently, fixing $\overrightarrow{\text{rm}}_{-i}$ and $\theta_{\text{init}}$, there exists a constant $c$ such that $p^{\prime}_{i}(\text{rm}_{i})=p_{i}(\text{rm}_{i})+c$ for all $\text{rm}_{i}\in\mathcal{R}_{i}$.

Payment equivalence indicates that the only way to modify a DSIC mechanism $(\psi,p)$ into another DSIC mechanism $(\psi,p^{\prime})$ is to add to each payment function $p_{i}$ a term independent of agent $i$'s own report. Thus, the payment equivalence of $\psi$ is sometimes interpreted as the uniqueness of the payment rule $p$ that implements it in DSIC. This notion is strong and useful: once a training rule $\psi$ satisfies payment equivalence and we find one mechanism $(\psi,p)$ that satisfies DSIC, all payment rules $p^{\prime}$ that implement $\psi$ in DSIC are characterized. In particular, it becomes possible to find the revenue-maximizing payment rule $p^{*}$ among all payment rules that implement $\psi$ in both DSIC and IR.

Payment equivalence is influenced by the domain of the types, which in the RLHF Game consists of reward models and group sizes. When $\vec{w}\equiv 1$, the agents only report their reward models, whose domain $\mathcal{R}$ contains all normalized reward models. Therefore, for all $i\in[n]$, the domain of the whole private information is exactly $\mathcal{R}$, a connected set in Euclidean space. Thus, we can directly apply the result in Nisan et al. (2007) to obtain the following proposition.

Proposition 3.7 .

When $\vec{w}\equiv 1$ is public information and the agents only report the reward models, all implementable training rules satisfy payment equivalence.

However, when the group sizes $\vec{w}$ are also part of the private information, the domain of the whole private information becomes $\mathcal{W}\times\mathcal{R}$, which is no longer connected because $\mathcal{W}\subseteq\mathbb{N}_+$. Thus, payment equivalence may not hold for general training rules, and we study it for a representative set of training rules in the following section.

4 Social Welfare Maximizing Mechanism

In this section, we consider the scenario where group $i$ consists of $w_i$ agents, and each group simultaneously reports its reward model and size. Our objective is to design a mechanism $(\psi,p)$ that incentivizes each group $i$ to truthfully report both $\text{rm}_i$ and $w_i$. For general training rules $\psi$, though it is possible to adopt the method used in the constructive proof of Theorem 3.5 to derive the payment rule, the resulting payment rule can be complex and impractical.

Therefore, in this section, our primary focus is on a subset of training rules designed to maximize social welfare under regularization constraints, which is commonly used in practice to aggregate various preferences ( Boyd and Vandenberghe ( 2004 ); Nocedal and Wright ( 1999 ) ), balancing efficiency and fairness.

Definition 4.1 (SW-Maximizing Training Rules) .

Given the reports $\overrightarrow{\text{rm}}$, $\vec{w}$, and the initial model $\theta_{\text{init}}$, an SW-Maximizing training rule fine-tunes the model to maximize social welfare under a regularization penalty measured by some metric $D$. Formally, it is represented as:

$$\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})=\arg\max_{\theta\in\Theta}\sum_{i=1}^{n}w_i v_i(\theta;\text{rm}_i)-\lambda D(\text{LLM}_{\theta}\,\|\,\text{LLM}_{\theta_{\text{init}}}).$$

Here, $\lambda$ is a hyperparameter that adjusts the regularization strength.

Note that SW-Maximizing training rules constitute a set of training rules; we write $\psi\in\Psi^{SW}$ to indicate that $\psi$ is a member of this set. Furthermore, similar to condition (3) of Assumption 3.2, we assume that for all $\overrightarrow{\text{rm}}$, $\vec{w}$, and $\theta_{\text{init}}$, the fine-tuned model $\theta=\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})$ satisfies $\text{LLM}_{\theta}(\bm{x})>0$ for all $\bm{x}\in T^{\ast}$. One simple way to achieve this is to set a large $\lambda$, so that the training result stays close enough to $\theta_{\text{init}}$.
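For intuition, when $D$ is the KL-divergence and valuations are linear in the output distribution (as in Assumption 3.1), the maximizer over a finite sequence set has a closed form: the initial distribution reweighted by the exponentiated, size-weighted aggregate reward. A minimal numeric sketch (function and variable names are our own illustration, not from the paper):

```python
import numpy as np

def sw_maximizing_policy(p_init, reward_models, weights, lam):
    """KL-regularized social-welfare maximizer over a finite sequence set.

    Maximizes sum_i w_i * <p, rm_i> - lam * KL(p || p_init); the optimum
    is p(x) proportional to p_init(x) * exp(sum_i w_i * rm_i(x) / lam).
    """
    agg = sum(w * rm for w, rm in zip(weights, reward_models))
    logits = np.log(p_init) + agg / lam
    p = np.exp(logits - logits.max())  # subtract max for numerical stability
    return p / p.sum()

# Toy instance: three sequences, two groups with sizes 2 and 1.
p_init = np.array([0.5, 0.3, 0.2])
rms = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
p = sw_maximizing_policy(p_init, rms, weights=[2, 1], lam=1.0)
```

As the remark above suggests, a large $\lambda$ keeps the result close to the initial model: with `lam=1e6` the sketch returns essentially `p_init`.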

4.1 Affine Maximizer Payment

We introduce the affine maximizer payment rule (Roberts (1979)) $p^{AFF}$, a weighted version of the VCG payment (Vickrey (1961); Clarke (1971); Groves (1973)):

$$p_i^{AFF}(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})=ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\psi(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}});\theta_{\text{init}})-ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}});\theta_{\text{init}}).$$

The notations $ASW(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})$ and $ASW_{-j}(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})$ refer to the affine social welfare with and without group $j$ when the reported reward models are $\overrightarrow{\text{rm}}$, the reported group sizes are $\vec{w}$, the initial model is $\text{LLM}_{\theta_{\text{init}}}$, and the model parameter is $\theta$. The affine social welfare consists of both the groups' valuations and the regularization term. Formally,

$$ASW(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})\coloneqq\sum_{i=1}^{n}w_i v_i(\theta;\text{rm}_i)-\lambda D(\text{LLM}_{\theta}\,\|\,\text{LLM}_{\theta_{\text{init}}}),$$
$$ASW_{-j}(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})\coloneqq\sum_{i=1,\,i\neq j}^{n}w_i v_i(\theta;\text{rm}_i)-\lambda D(\text{LLM}_{\theta}\,\|\,\text{LLM}_{\theta_{\text{init}}}).$$

We show that $p^{AFF}$ implements SW-Maximizing training rules in both DSIC and IR, which implies that truthfully reporting both reward models and group sizes constitutes a dominant-strategy equilibrium of this mechanism.

Theorem 4.2 .

For any $\psi\in\Psi^{SW}$, the mechanism $(\psi,p^{AFF})$ satisfies DSIC and IR.
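To make the payment concrete, the following sketch computes $p^{AFF}$ on a finite sequence set, using the closed-form KL-regularized maximizer to stand in for training. All names are illustrative; in a real deployment each $\psi(\cdot)$ call is a full fine-tuning run.

```python
import numpy as np

def softmax_policy(p_init, rms, w, lam):
    # SW-maximizing policy under KL regularization (closed form).
    logits = np.log(p_init) + sum(wi * r for wi, r in zip(w, rms)) / lam
    p = np.exp(logits - logits.max())
    return p / p.sum()

def asw(p, p_init, rms, w, lam, exclude=None):
    # Affine social welfare at policy p, optionally excluding one group.
    val = sum(wi * r @ p for j, (wi, r) in enumerate(zip(w, rms)) if j != exclude)
    return val - lam * np.sum(p * np.log(p / p_init))

def affine_maximizer_payment(i, p_init, rms, w, lam):
    """p_i^AFF: the externality group i imposes on the others --
    ASW_{-i} at the model trained without i, minus ASW_{-i} at the
    model trained on the full report profile."""
    theta_without_i = softmax_policy(p_init, rms[:i] + rms[i+1:], w[:i] + w[i+1:], lam)
    theta_full = softmax_policy(p_init, rms, w, lam)
    return (asw(theta_without_i, p_init, rms, w, lam, exclude=i)
            - asw(theta_full, p_init, rms, w, lam, exclude=i))
```

Since the model trained without group $i$ maximizes $ASW_{-i}$ by construction, this payment is always nonnegative.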

Regarding payment equivalence, as mentioned in the previous section, the domain $\mathcal{W}\times\mathcal{R}$ is not connected in Euclidean space since $\mathcal{W}\subseteq\mathbb{N}_+$, so the results of Nisan et al. (2007) cannot be directly applied. However, we show that under the following assumption, SW-Maximizing training rules satisfy payment equivalence.

Assumption 4.3 .

For any $\epsilon>0$, there exists $\delta>0$ such that for any $\theta_{\text{init}}$, $\overrightarrow{\text{rm}}$, $\overrightarrow{\text{rm}}'$, $\vec{w}$, and $\vec{w}'$, if $\max_{\bm{x}\in T^{\ast}}\left|\sum_{i=1}^{n}\left(w_i\text{rm}_i(\bm{x})-w'_i\text{rm}'_i(\bm{x})\right)\right|\leq\delta$, then $\max_{\bm{x}\in T^{\ast}}\left|\text{LLM}_{\theta}(\bm{x})-\text{LLM}_{\theta'}(\bm{x})\right|\leq\epsilon$, where $\theta:=\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})$ and $\theta':=\psi(\overrightarrow{\text{rm}}',\vec{w}',\theta_{\text{init}})$.

This assumption is reasonable for most measures $D$ used in SW-Maximizing training rules, since the space of $\theta$ is continuous. The continuity ensures that when the reported information $(\overrightarrow{\text{rm}},\vec{w})$ and $(\overrightarrow{\text{rm}}',\vec{w}')$ are sufficiently close, the training outcomes $\theta$ and $\theta'$ are also close. Specifically, we validate this assumption for some widely used distance measures.

Proposition 4.4 .

Assumption 4.3 holds for SW-Maximizing training rules with the KL-divergence regularization, $D_{\mathrm{KL}}(p\,\|\,q)=\sum_{\bm{x}\in T^{\ast}}p(\bm{x})\log\left(p(\bm{x})/q(\bm{x})\right)$, and the $L_2$ distance, $D_2(p\,\|\,q)=\sum_{\bm{x}\in T^{\ast}}\left(p(\bm{x})-q(\bm{x})\right)^2$.
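Written as code over a finite sequence set, the two regularizers are direct transcriptions of the formulas above (a sketch; `d_kl` assumes strictly positive distributions):

```python
import numpy as np

def d_kl(p, q):
    # KL divergence: sum_x p(x) * log(p(x) / q(x)); assumes q > 0 everywhere.
    return np.sum(p * np.log(p / q))

def d_l2(p, q):
    # Squared L2 distance: sum_x (p(x) - q(x))^2.
    return np.sum((p - q) ** 2)
```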

Under this assumption, we derive the following result:

Theorem 4.5 .

Under Assumptions 3.1 and 4.3, each training rule $\psi\in\Psi^{SW}$ satisfies payment equivalence.

With the property of payment equivalence, we further investigate the revenue-maximizing payment rule that implements SW-Maximizing training rules in both DSIC and IR. Finding the revenue-maximizing multi-parameter mechanism is a challenging problem in classic mechanism design theory. However, since we have proved payment equivalence for SW-Maximizing training rules, we can use the necessary condition in Definition 3.6 to formulate it as an optimization problem. Solving this problem yields the optimal payment rule under the same conditions.

Corollary 4.6 .

Under Assumptions 3.1 and 4.3, for each training rule $\psi\in\Psi^{SW}$, the revenue-maximizing payment rule $p^{*}$ that implements $\psi$ in both DSIC and IR is given by

$$p^{*}_{i}((\text{rm}_i,\overrightarrow{\text{rm}}_{-i}),(w_i,\vec{w}_{-i}),\theta_{\text{init}})=p_i^{AFF}((\text{rm}_i,\overrightarrow{\text{rm}}_{-i}),(w_i,\vec{w}_{-i});\theta_{\text{init}})+\inf_{\text{rm}'_i\in\mathcal{R},\,w'_i\in\mathcal{W}}u_i((\text{rm}'_i,\overrightarrow{\text{rm}}_{-i}),(w'_i,\vec{w}_{-i});\psi,p^{AFF},\text{rm}'_i,w'_i).$$

The relationship between the domains $\mathcal{R}\times\mathcal{W}$ and this corollary is reflected in two aspects. First, establishing payment equivalence depends on assumptions about the choice of $\mathcal{R}$ and $\mathcal{W}$, particularly that $\mathcal{R}$ includes all normalized reward models. Second, building on payment equivalence, finding the revenue-maximizing mechanism that satisfies IR also requires knowledge of the exact domains.
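Corollary 4.6 reduces revenue maximization to adding the worst-case truthful-utility slack under $p^{AFF}$. Over a hypothetical finite discretization of $\mathcal{R}$ and $\mathcal{W}$ (the true $\mathcal{R}$ is continuous, so this is only an illustration with toy values), the infimum becomes a minimum found by enumeration:

```python
import numpy as np
from itertools import product

def policy(p_init, rms, w, lam):
    # Closed-form KL-regularized SW maximizer (stands in for training).
    logits = np.log(p_init) + sum(wi * r for wi, r in zip(w, rms)) / lam
    p = np.exp(logits - logits.max())
    return p / p.sum()

def asw_minus(p, p_init, rms, w, lam, i):
    # Affine social welfare excluding group i, evaluated at policy p.
    val = sum(wj * r @ p for j, (wj, r) in enumerate(zip(w, rms)) if j != i)
    return val - lam * np.sum(p * np.log(p / p_init))

def p_aff(i, p_init, rms, w, lam):
    th_mi = policy(p_init, rms[:i] + rms[i+1:], w[:i] + w[i+1:], lam)
    th = policy(p_init, rms, w, lam)
    return asw_minus(th_mi, p_init, rms, w, lam, i) - asw_minus(th, p_init, rms, w, lam, i)

def truthful_utility(i, p_init, rms, w, lam):
    # Group i's quasilinear utility when its report equals its true type.
    th = policy(p_init, rms, w, lam)
    return w[i] * rms[i] @ th - p_aff(i, p_init, rms, w, lam)

p_init = np.array([0.4, 0.35, 0.25])
rms, w, lam = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])], [2, 1], 1.0
# Coarse grids standing in for R and W.
rm_grid = [np.array(v) for v in product([0.0, 0.5, 1.0], repeat=3)]
w_grid = [1, 2, 3]
slack = min(truthful_utility(0, p_init, [r, rms[1]], [wi, w[1]], lam)
            for r in rm_grid for wi in w_grid)
p_star_0 = p_aff(0, p_init, rms, w, lam) + slack
```

The slack is nonnegative because $p^{AFF}$ is IR, so $p^*$ charges weakly more than $p^{AFF}$ while leaving IR tight for the worst-off type.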

4.2 Approximate Valuation

In this part, we discuss how errors arising in practice affect the incentive properties of the RLHF Game. We abstract this as an approximate valuation problem (Chiesa et al. (2012)). Formally, when group $i$ reports its reward model $\text{rm}_i$, the mechanism may not use $\text{rm}_i$ exactly but rather a noisy reward model $\widehat{\text{rm}}_i$, drawn from a conditional distribution $F_i(\cdot\,|\,\text{rm}_i)$, as its input. We argue that this abstraction covers various error cases. One example is that computing the valuation defined in Assumption 3.1 requires sampling sequences from the LLM, which may cause a deviation from the true valuation. Another is that the fine-tuned model $\text{LLM}_{\theta}$ may not be exactly optimal for the reported reward models; this model can instead be regarded as optimal for a perturbed version of the reported reward models.

We assume that the groups are aware of the noise when feeding preferences into the mechanism. Therefore, their utilities take it into account and have a different form. We use the capital letter $U_i$ to denote group $i$'s revised utility. Formally, for group $i$ with reward model $\text{rm}_i$ and group size $w_i$, its utility for reporting $(\text{rm}'_i,w'_i)$ is given by

$$U_i((\text{rm}'_i,\overrightarrow{\text{rm}}_{-i}),(w'_i,\vec{w}_{-i});\psi,p,\text{rm}_i,w_i)=\mathbb{E}_{\widehat{\text{rm}}_i\sim F_i(\cdot\,|\,\text{rm}'_i)}\,u_i((\widehat{\text{rm}}_i,\overrightarrow{\text{rm}}_{-i}),(w'_i,\vec{w}_{-i});\psi,p,\text{rm}_i,w_i).$$

Note that in defining $U_i$, we implicitly assume that each group cannot observe the other groups' noise; therefore, the expectation is not taken with respect to $\overrightarrow{\text{rm}}_{-i}$.

We only consider the case where the noisy input to the mechanism is close to the reported reward models:

Assumption 4.7 (Bounded Error) .

For any profile of reported reward models $\overrightarrow{\text{rm}}$, any profile of reward models $\overrightarrow{\widehat{\text{rm}}}$ that can be generated from the $F_i(\cdot\,|\,\text{rm}_i)$'s with non-zero probability satisfies

$$\max_{\bm{x}\in T^{\ast}}\left|\widehat{\text{rm}}_i(\bm{x})-\text{rm}_i(\bm{x})\right|\leq\epsilon\quad\forall i\in[n].$$

We first show that, when the results of Section 4.1 are applied directly to the noisy input, the loss in social welfare is upper-bounded by $2\epsilon\sum_{i=1}^{n}w_i$.

Lemma 4.8 .

Under Assumptions 3.1 and 4.7, when the training rule $\psi\in\Psi^{SW}$, the loss in social welfare is bounded by

$$ASW(\overrightarrow{\text{rm}},\vec{w},\psi(\overrightarrow{\widehat{\text{rm}}},\vec{w},\theta_{\text{init}});\theta_{\text{init}})\geq ASW(\overrightarrow{\text{rm}},\vec{w},\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}});\theta_{\text{init}})-2\epsilon\sum_{i=1}^{n}w_i.$$
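The bound of Lemma 4.8 can be checked numerically with the closed-form KL-regularized maximizer standing in for training (all values below are arbitrary toy data, not from the paper):

```python
import numpy as np

def policy(p_init, rms, w, lam):
    # KL-regularized SW maximizer: p(x) proportional to p_init(x) exp(sum_i w_i rm_i(x) / lam).
    logits = np.log(p_init) + sum(wi * r for wi, r in zip(w, rms)) / lam
    p = np.exp(logits - logits.max())
    return p / p.sum()

def asw(p, p_init, rms, w, lam):
    # Affine social welfare at policy p.
    val = sum(wi * r @ p for wi, r in zip(w, rms))
    return val - lam * np.sum(p * np.log(p / p_init))

rng = np.random.default_rng(0)
p_init = rng.dirichlet(np.ones(5))
rms = [rng.random(5) for _ in range(3)]   # reported reward models
w, lam, eps = [3, 1, 2], 0.5, 0.05
# Noisy versions within sup-distance eps of the reports (Assumption 4.7).
rms_hat = [r + rng.uniform(-eps, eps, 5) for r in rms]
theta_hat = policy(p_init, rms_hat, w, lam)   # trained on the noisy input
theta = policy(p_init, rms, w, lam)           # trained on the reports
loss = asw(theta, p_init, rms, w, lam) - asw(theta_hat, p_init, rms, w, lam)
```

Here `loss` is nonnegative, since `theta` is the exact maximizer, and by the lemma it never exceeds $2\epsilon\sum_i w_i$ (0.6 with these toy numbers).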

For a training rule $\psi\in\Psi^{SW}$, a group's utility in the mechanism $(\psi,p^{AFF})$ consists of an affine social welfare term $ASW$. Therefore, we can derive the following theorem from Lemma 4.8.

Theorem 4.9 .

Under Assumptions 3.1 and 4.7, when the training rule $\psi\in\Psi^{SW}$, for any group $i$ and any $\text{rm}_i$, $\text{rm}'_i$, $\overrightarrow{\text{rm}}_{-i}$, $w_i$, and $\vec{w}_{-i}$, we have

$$U_i((\text{rm}_i,\overrightarrow{\text{rm}}_{-i}),(w_i,\vec{w}_{-i});\psi,p^{AFF},\text{rm}_i,w_i)\geq U_i((\text{rm}'_i,\overrightarrow{\text{rm}}_{-i}),(w_i,\vec{w}_{-i});\psi,p^{AFF},\text{rm}_i,w_i)-2w_i\epsilon.$$

In other words, when $\vec{w}$ is truthfully reported, $(\psi,p^{AFF})$ is a $\max_{i\in[n]}2w_i\epsilon$-DSIC mechanism.

This means that for any group $i$, the maximum gain from misreporting is at most $2w_i\epsilon$, regardless of the others' reports. Agents will tend to report truthfully whenever finding the optimal misreport costs more than $2w_i\epsilon$.

5 Further Related Work

RLHF with Multiple Reward Models.

Research involving multiple reward models primarily focuses on developing algorithms to enhance practical performance. Some studies design methods to simultaneously satisfy multiple preferences ( Ramé et al. ( 2024 ); Wu et al. ( 2024 ); Jang et al. ( 2023 ); Park et al. ( 2024 ) ). Additionally, there is a body of work that trains multiple models for a single preference and then ensembles them to improve the robustness of RLHF ( Coste et al. ( 2023 ); Zhang et al. ( 2024 ) ), mitigate the influence of incorrect and ambiguous preferences in the dataset ( Wang et al. ( 2024 ) ), and reduce reward hacking ( Eisenstein et al. ( 2023 ) ). Unlike these approaches, our work considers how to collect misaligned preferences truthfully from different agents.

Multi-parameter Auctions.

Several studies have explored properties relevant to our paper in various multi-parameter auction scenarios, such as implementability (Rochet (1987); Miyake (1998); Conitzer and Sandholm (2004); Saks and Yu (2005); Bikhchandani et al. (2006); Ashlagi et al. (2010)) and payment equivalence (Ivanova-Stenzel and Salmon (2008); Heydenreich et al. (2009); Bergemann and Välimäki (2010); Pavan et al. (2014)). Another central topic in auction theory is designing mechanisms that satisfy DSIC and IR while maximizing the auctioneer's expected revenue. Although the single-parameter scenario was resolved by Myerson (1981), optimal auction design for multi-parameter settings remains an open question. Therefore, a stream of research focuses on a specific subset, affine maximizer auctions, which inherently satisfy DSIC and IR (Sandholm and Likhodedov (2015); Roberts (1979); Likhodedov and Sandholm (2004); Briest et al. (2010); Tang and Sandholm (2012); Jehiel et al. (2007)), and proposes optimizations to enhance empirical performance (Curry et al. (2022); Duan et al. (2024a, b)). Compared to these works, we are the first to discuss payment equivalence and the revenue-maximizing solution in the scenario of fine-tuning LLMs.

Game Theory and LLMs.

Other works have also explored the intersection of game theory and large language models. Some research proposes algorithms for training LLMs inspired by game-theoretic concepts, such as Nash learning from human feedback (Munos et al. (2023)), the consensus game (Jacob et al. (2023)), and direct Nash optimization (Rosset et al. (2024)); see also Gemp et al. (2024). Furthermore, various studies assess LLMs from a game-theoretic perspective, examining aspects such as rationality (Chen et al. (2023); Fan et al. (2023)), behavior in matrix games (Akata et al. (2023); Gandhi et al. (2023); Lorè and Heydari (2023)), and performance in strategic games like auctions (Guo et al. (2023, 2024)), Werewolf (Xu et al. (2023a, b)), and Avalon (Wang et al. (2023a)).

6 Discussion and Conclusion

Efficient Practical Implementation of $p^{AFF}$.

In the RLHF Game with $n$ groups, calculating $p^{AFF}$ requires $n$ separate complete training runs for different $\psi$'s, which can be inefficient due to costly training. To address this problem, we propose two modifications to $p^{AFF}$. Both involve computing an approximation $\widehat{\psi}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})$ instead of the true optimum $\psi(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})$ when calculating payments:

  1. Calculate an approximation $\widehat{\psi}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})=\arg\max_{\theta\in\{\theta_1,\dots,\theta_K\}}ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})$, where $\{\theta_1,\dots,\theta_K\}$ are the parameters saved during the training of $\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})$.

  2. Use fewer iterations in the training process for calculating $\psi(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})$, thus obtaining a result $\widehat{\psi}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})$ that is not exactly optimal.

The first method needs only one training process (for $\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})$) but affects the DSIC property, since the saved parameters $\{\theta_1,\dots,\theta_K\}$ are also influenced by $i$'s report. In comparison, the second approach incurs higher training costs but guarantees strict DSIC.
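The first modification can be sketched as a selection over saved checkpoints (policy distributions stand in for model parameters here, and all names are illustrative):

```python
import numpy as np

def asw_minus_i(p, p_init, rms, w, lam, i):
    # Affine social welfare excluding group i, evaluated at policy p.
    val = sum(wj * r @ p for j, (wj, r) in enumerate(zip(w, rms)) if j != i)
    return val - lam * np.sum(p * np.log(p / p_init))

def checkpoint_payment(i, checkpoints, p_final, p_init, rms, w, lam):
    """Approximate p_i^AFF by using the saved checkpoint that maximizes
    ASW_{-i} in place of a separate training run without group i."""
    best = max(checkpoints, key=lambda p: asw_minus_i(p, p_init, rms, w, lam, i))
    return (asw_minus_i(best, p_init, rms, w, lam, i)
            - asw_minus_i(p_final, p_init, rms, w, lam, i))
```

If the final model is among the saved checkpoints, the approximate payment is nonnegative; but because the checkpoints come from training on the full report profile, they depend on group $i$'s report, which is exactly why this variant only approximates DSIC.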

Conclusion and Future Work.

This paper investigates incentive design in fine-tuning large language models using multiple reward models. We formalize this scenario as the RLHF Game, where a service provider proposes training and payment rules, and agents strategically report their preferences. We demonstrate the necessity of payment schemes for incentivizing truthful reporting in general training rules and provide a comprehensive characterization of payment schemes that implement SW-Maximizing training rules in dominant strategies. These findings enhance the theoretical understanding of mechanism design in LLM fine-tuning and offer guidelines for implementing effective RLHF-based systems in various contexts.

Future research in this field presents several promising directions. Firstly, investigating mechanisms integrating efficiency and incentive compatibility within the RLHF Game could significantly enhance its applicability in real-world scenarios. Secondly, modeling and examining more complex training rules, such as dynamic training rules, could deepen the understanding of this framework. Thirdly, designing mechanisms for more general cases that aggregate preferences into multiple models based on diversity considerations is crucial. Additionally, applying mechanism design theory to other scenarios related to large language models, such as API charge schemes, retrieval-augmented generation (RAG), and prompt engineering, offers valuable opportunities for further exploration.

References

Appendix A Omitted proof in Section 3

See 3.3

Proof.

For the case $\vec{w}=\vec{1}$, the optimization of $\psi$ can be written as a programming problem:

$$\begin{aligned}\psi(\overrightarrow{\text{rm}},\theta_{\text{init}})\coloneqq\arg\max_{\theta\in\Theta}\quad&f(v_1(\theta;\text{rm}_1),\dots,v_n(\theta;\text{rm}_n),D(\text{LLM}_{\theta}\,\|\,\text{LLM}_{\theta_{\text{init}}}))\\ \text{s.t.}\quad&\sum_{\bm{x}\in T^{\ast}}\text{LLM}_{\theta}(\bm{x})=1\\ &\text{LLM}_{\theta}(\bm{x})\geq 0\quad\forall\bm{x}\in T^{\ast}.\end{aligned}$$

By condition (3) of Assumption 3.2, the constraint $\text{LLM}_{\theta}(\bm{x})\geq 0$, $\forall\bm{x}\in T^{\ast}$, is not active at the optimal solution. Further, by condition (1) of Assumption 3.2, all the partial derivatives exist. Since the optimal solution is also a local extreme point, the necessary condition for the optimal $\theta$ is that there exists $\mu\in\mathbb{R}$ (Luenberger et al. [1984]) such that

$$\sum_{i=1}^{n}\left(\frac{\partial f}{\partial v_i}\frac{\partial v_i}{\partial\text{LLM}_{\theta}(\bm{x})}\right)+\frac{\partial f}{\partial D}\frac{\partial D}{\partial\text{LLM}_{\theta}(\bm{x})}=\mu\quad\forall\bm{x}\in T^{\ast}.$$

Under Assumption 3.1, $\frac{\partial v_i}{\partial\text{LLM}_{\theta}(\bm{x})}=\text{rm}_i(\bm{x})$, so we have

$$\sum_{i=1}^{n}\left(\frac{\partial f}{\partial v_i}\text{rm}_i(\bm{x})\right)+\frac{\partial f}{\partial D}\frac{\partial D}{\partial\text{LLM}_{\theta}(\bm{x})}=\mu\quad\forall\bm{x}\in T^{\ast}.\tag{1}$$

When agent $i$'s true reward model is $\text{rm}_i$, we construct a reported reward model $\text{rm}'_i$ and show that $v_i(\psi((\text{rm}'_i,\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_i)>v_i(\psi(\overrightarrow{\text{rm}},\theta_{\text{init}});\text{rm}_i)$. Denote by $S$ the set of sequences $\bm{x}$ with non-zero reward value, i.e., $S=\{\bm{x}:\text{rm}_i(\bm{x})>0\}$, and take $\bm{x}_1\in\arg\max_{\bm{x}\in S}\text{rm}_i(\bm{x})$ and $\bm{x}_2\in\arg\min_{\bm{x}\in S}\text{rm}_i(\bm{x})$. Then we take a small $\epsilon<\text{rm}_i(\bm{x}_2)$ and define $\text{rm}'_i$ as:

$$\text{rm}'_i(\bm{x})=\begin{cases}\text{rm}_i(\bm{x})+\epsilon,&\bm{x}=\bm{x}_1,\\ \text{rm}_i(\bm{x})-\epsilon,&\bm{x}=\bm{x}_2,\\ \text{rm}_i(\bm{x}),&\bm{x}\neq\bm{x}_1,\bm{x}_2.\end{cases}$$

Intuitively, $\text{rm}'_i$ assigns more value to the element with the highest $\text{rm}_i$ value and less to the element with the lowest non-zero $\text{rm}_i$ value. Let $\theta=\psi(\overrightarrow{\text{rm}},\theta_{\text{init}})$ and $\theta'=\psi((\text{rm}'_i,\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})$; we analyze the variation of the corresponding policies $\text{LLM}_{\theta}$ and $\text{LLM}_{\theta'}$. We use $\mu$ and $\mu'$ to denote the multipliers in the necessary condition for $\text{LLM}_{\theta}$ and $\text{LLM}_{\theta'}$, and derive the following results.

(a) $\text{LLM}_{\theta'}(\bm{x}_1)>\text{LLM}_{\theta}(\bm{x}_1)$ and $\text{LLM}_{\theta'}(\bm{x}_2)<\text{LLM}_{\theta}(\bm{x}_2)$. We prove the former by contradiction: if $\text{LLM}_{\theta'}(\bm{x}_1)\leq\text{LLM}_{\theta}(\bm{x}_1)$, then by the assumption $\partial^2 D/\partial\text{LLM}_{\theta}(\bm{x})^2>0$, we have

$$\frac{\partial D}{\partial\text{LLM}_{\theta'}(\bm{x}_1)}\leq\frac{\partial D}{\partial\text{LLM}_{\theta}(\bm{x}_1)}.$$

With $\text{rm}'_i(\bm{x}_1)>\text{rm}_i(\bm{x}_1)$, $\partial f/\partial v_i>0$, and $\partial f/\partial D<0$, we can infer that $\mu'>\mu$. However, since for all $\bm{x}\neq\bm{x}_1$ we have

$$\frac{\partial f}{\partial v_i}\text{rm}'_i(\bm{x})\leq\frac{\partial f}{\partial v_i}\text{rm}_i(\bm{x}),$$

to satisfy the optimality condition (1), there must be, for all $\bm{x}\neq\bm{x}_1$,

$$\frac{\partial D}{\partial\text{LLM}_{\theta'}(\bm{x})}<\frac{\partial D}{\partial\text{LLM}_{\theta}(\bm{x})}.$$

This is equivalent to $\text{LLM}_{\theta'}(\bm{x})<\text{LLM}_{\theta}(\bm{x})$, and hence results in $\sum_{\bm{x}\in T^{\ast}}\text{LLM}_{\theta'}(\bm{x})<\sum_{\bm{x}\in T^{\ast}}\text{LLM}_{\theta}(\bm{x})=1$, contradicting the normalization constraint. The latter claim, $\text{LLM}_{\theta'}(\bm{x}_2)<\text{LLM}_{\theta}(\bm{x}_2)$, can be proved by exactly the same method.

(b) The order of $\text{LLM}_{\theta}(\bm{x})$ and $\text{LLM}_{\theta'}(\bm{x})$ is consistent for all $\bm{x}\neq\bm{x}_1,\bm{x}_2$. Without loss of generality, suppose there is $\bm{x}_3\neq\bm{x}_1,\bm{x}_2$ such that $\text{LLM}_{\theta'}(\bm{x}_3)\geq\text{LLM}_{\theta}(\bm{x}_3)$. Then we have

$$\frac{\partial D}{\partial\text{LLM}_{\theta'}({\bm{x}}_{3})}\geq\frac{\partial D}{\partial\text{LLM}_{\theta}({\bm{x}}_{3})}.$$

Since $\partial f/\partial D<0$, we can infer that $\mu'\leq\mu$. Then, to satisfy (1), for all ${\bm{x}}\neq{\bm{x}}_{1},{\bm{x}}_{2}$ there must be

$$\frac{\partial D}{\partial\text{LLM}_{\theta'}({\bm{x}})}\geq\frac{\partial D}{\partial\text{LLM}_{\theta}({\bm{x}})},$$

which is equivalent to $\text{LLM}_{\theta'}({\bm{x}})\geq\text{LLM}_{\theta}({\bm{x}})$. Similarly, if there is ${\bm{x}}_{3}\neq{\bm{x}}_{1},{\bm{x}}_{2}$ such that $\text{LLM}_{\theta'}({\bm{x}}_{3})\leq\text{LLM}_{\theta}({\bm{x}}_{3})$, then $\text{LLM}_{\theta'}({\bm{x}})\leq\text{LLM}_{\theta}({\bm{x}})$ for all ${\bm{x}}\neq{\bm{x}}_{1},{\bm{x}}_{2}$.

Finally, combining the results in (a) and (b): when $\text{LLM}_{\theta'}({\bm{x}})\leq\text{LLM}_{\theta}({\bm{x}})$ for all ${\bm{x}}\neq{\bm{x}}_{1},{\bm{x}}_{2}$, we have

\begin{align*}
& v_{i}(\psi((\text{rm}'_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i})-v_{i}(\psi(\overrightarrow{\text{rm}},\theta_{\text{init}});\text{rm}_{i})\\
={}& \sum_{{\bm{x}}\in T^{\ast}}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)\text{rm}_{i}({\bm{x}})\\
={}& \sum_{{\bm{x}}\in S}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)\text{rm}_{i}({\bm{x}})\\
={}& \sum_{{\bm{x}}\neq{\bm{x}}_{1},{\bm{x}}\in S}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)\text{rm}_{i}({\bm{x}})+\left(\text{LLM}_{\theta'}({\bm{x}}_{1})-\text{LLM}_{\theta}({\bm{x}}_{1})\right)\text{rm}_{i}({\bm{x}}_{1})\\
={}& -\sum_{{\bm{x}}\neq{\bm{x}}_{1},{\bm{x}}\in S}\left(\text{LLM}_{\theta}({\bm{x}})-\text{LLM}_{\theta'}({\bm{x}})\right)\text{rm}_{i}({\bm{x}})+\left(\text{LLM}_{\theta'}({\bm{x}}_{1})-\text{LLM}_{\theta}({\bm{x}}_{1})\right)\text{rm}_{i}({\bm{x}}_{1})\\
\overset{(2)}{\geq}{}& -\sum_{{\bm{x}}\neq{\bm{x}}_{1},{\bm{x}}\in S}\left(\text{LLM}_{\theta}({\bm{x}})-\text{LLM}_{\theta'}({\bm{x}})\right)\text{rm}_{i}({\bm{x}}_{1})+\left(\text{LLM}_{\theta'}({\bm{x}}_{1})-\text{LLM}_{\theta}({\bm{x}}_{1})\right)\text{rm}_{i}({\bm{x}}_{1})\\
={}& \text{rm}_{i}({\bm{x}}_{1})\left(\text{LLM}_{\theta'}({\bm{x}}_{1})-\text{LLM}_{\theta}({\bm{x}}_{1})-\sum_{{\bm{x}}\neq{\bm{x}}_{1}}\left(\text{LLM}_{\theta}({\bm{x}})-\text{LLM}_{\theta'}({\bm{x}})\right)\right)\\
={}& \text{rm}_{i}({\bm{x}}_{1})\sum_{{\bm{x}}\in T^{\ast}}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)=0.
\end{align*}

When $\text{LLM}_{\theta'}({\bm{x}})\geq\text{LLM}_{\theta}({\bm{x}})$ for all ${\bm{x}}\neq{\bm{x}}_{1},{\bm{x}}_{2}$, we have

\begin{align*}
& v_{i}(\psi((\text{rm}'_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i})-v_{i}(\psi(\overrightarrow{\text{rm}},\theta_{\text{init}});\text{rm}_{i})\\
={}& \sum_{{\bm{x}}\in T^{\ast}}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)\text{rm}_{i}({\bm{x}})\\
={}& \sum_{{\bm{x}}\in S}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)\text{rm}_{i}({\bm{x}})\\
={}& \sum_{{\bm{x}}\neq{\bm{x}}_{2},{\bm{x}}\in S}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)\text{rm}_{i}({\bm{x}})+\left(\text{LLM}_{\theta'}({\bm{x}}_{2})-\text{LLM}_{\theta}({\bm{x}}_{2})\right)\text{rm}_{i}({\bm{x}}_{2})\\
={}& \sum_{{\bm{x}}\neq{\bm{x}}_{2},{\bm{x}}\in S}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)\text{rm}_{i}({\bm{x}})-\left(\text{LLM}_{\theta}({\bm{x}}_{2})-\text{LLM}_{\theta'}({\bm{x}}_{2})\right)\text{rm}_{i}({\bm{x}}_{2})\\
\overset{(3)}{\geq}{}& \sum_{{\bm{x}}\neq{\bm{x}}_{2},{\bm{x}}\in S}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)\text{rm}_{i}({\bm{x}}_{2})-\left(\text{LLM}_{\theta}({\bm{x}}_{2})-\text{LLM}_{\theta'}({\bm{x}}_{2})\right)\text{rm}_{i}({\bm{x}}_{2})\\
={}& \text{rm}_{i}({\bm{x}}_{2})\left(\sum_{{\bm{x}}\neq{\bm{x}}_{2}}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)-\left(\text{LLM}_{\theta}({\bm{x}}_{2})-\text{LLM}_{\theta'}({\bm{x}}_{2})\right)\right)\\
={}& \text{rm}_{i}({\bm{x}}_{2})\sum_{{\bm{x}}\in T^{\ast}}\left(\text{LLM}_{\theta'}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})\right)=0.
\end{align*}

Note that both (2) and (3) follow from $\text{rm}_{i}({\bm{x}}_{1})\geq\text{rm}_{i}({\bm{x}}_{2})$. Moreover, unless $\text{rm}_{i}({\bm{x}}_{1})=\text{rm}_{i}({\bm{x}}_{2})$, which means that all ${\bm{x}}\in S$ share the same reward value $\text{rm}_{i}({\bm{x}})$, the inequalities are strict. ∎

See 3.5

Proof.

We first prove necessity: if $\psi$ is implementable, then it satisfies cycle monotonicity. Since $\psi$ is implementable, there exists $p$ such that $(\psi,p)$ satisfies DSIC. Then for any $\text{rm}_{i},\text{rm}'_{i}\in\mathcal{R}_{i}$, $\overrightarrow{\text{rm}}_{-i}$, $\theta_{\text{init}}$, and any finite and distinct sequence of reward models $[\text{rm}_{i},\text{rm}_{i}^{1},\text{rm}_{i}^{2},\cdots,\text{rm}_{i}^{k},\text{rm}'_{i}]$, $k\geq 0$, we let $\text{rm}_{i}^{0}=\text{rm}_{i}^{k+2}:=\text{rm}_{i}$ and $\text{rm}_{i}^{k+1}:=\text{rm}'_{i}$. By DSIC, we have

\begin{align*}
v_{i}(\psi((\text{rm}_{i}^{j+1},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i}^{j+1})-p_{i}((\text{rm}_{i}^{j+1},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})\geq{}&\\
v_{i}(\psi((\text{rm}_{i}^{j},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i}^{j+1})-p_{i}((\text{rm}_{i}^{j},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})&\quad\forall\,0\leq j\leq k+1.
\end{align*}

By the definition of the function $l$, this is equivalent to

$$l(\text{rm}_{i}^{j},\text{rm}_{i}^{j+1};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\geq p_{i}((\text{rm}_{i}^{j+1},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})-p_{i}((\text{rm}_{i}^{j},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})\quad\forall\,0\leq j\leq k+1.$$

Summing over all $j$, the payment terms telescope around the cycle (recall $\text{rm}_{i}^{0}=\text{rm}_{i}^{k+2}$), and we get

$$\sum_{j=0}^{k+1}l(\text{rm}_{i}^{j},\text{rm}_{i}^{j+1};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\geq\sum_{j=0}^{k+1}\left(p_{i}((\text{rm}_{i}^{j+1},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})-p_{i}((\text{rm}_{i}^{j},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})\right)=0.$$

This means that $\psi$ satisfies cycle monotonicity.

Next, we prove sufficiency: if $\psi$ satisfies cycle monotonicity, then it is implementable. For any two $\text{rm}_{i},\text{rm}'_{i}\in\mathcal{R}_{i}$, we use the function $V$ to denote the shortest length, measured by $l$, over all sequences from $\text{rm}_{i}$ to $\text{rm}'_{i}$. Formally,

$$V(\text{rm}_{i},\text{rm}'_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\coloneqq\inf_{\substack{\text{finite and distinct sequences}\\ [\text{rm}_{i}^{0}:=\text{rm}_{i},\text{rm}_{i}^{1},\cdots,\text{rm}_{i}^{k},\text{rm}_{i}^{k+1}:=\text{rm}'_{i}]}}\sum_{j=0}^{k}l(\text{rm}_{i}^{j},\text{rm}_{i}^{j+1};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}}).$$

By cycle monotonicity, we have that for any finite and distinct sequence $[\text{rm}_{i},\text{rm}_{i}^{1},\text{rm}_{i}^{2},\cdots,\text{rm}_{i}^{k},\text{rm}'_{i}]$,

$$\sum_{j=0}^{k}l(\text{rm}_{i}^{j},\text{rm}_{i}^{j+1};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})+l(\text{rm}'_{i},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})=\sum_{j=0}^{k+1}l(\text{rm}_{i}^{j},\text{rm}_{i}^{j+1};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\geq 0.$$

By the arbitrariness of the sequence, we can infer that

$$V(\text{rm}_{i},\text{rm}'_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})+l(\text{rm}'_{i},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\geq 0.$$

Since $l(\text{rm}'_{i},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})$ is bounded, $V(\text{rm}_{i},\text{rm}'_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})$ is also finite, and $V(\text{rm}_{i},\text{rm}'_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\geq-l(\text{rm}'_{i},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})$. We can then define a payment rule $p$ such that for any agent $i$,

$$p_{i}((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})=V(\text{rm}^{*},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}}),$$

where $\text{rm}^{*}$ is defined as follows:

$$\text{rm}^{*}({\bm{x}})=1/|T^{\ast}|\quad\forall{\bm{x}}\in T^{\ast}.\tag{1}$$

Then, for any $\text{rm}_{i}$, $\text{rm}'_{i}$, $\overrightarrow{\text{rm}}_{-i}$ and $\theta_{\text{init}}$, we have

\begin{align*}
& v_{i}(\psi((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i})-p_{i}((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}})\\
={}& v_{i}(\psi((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i})-V(\text{rm}^{*},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\\
={}& v_{i}(\psi((\text{rm}'_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i})+l(\text{rm}'_{i},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})-V(\text{rm}^{*},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\\
\overset{(2)}{\geq}{}& v_{i}(\psi((\text{rm}'_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i})-V(\text{rm}^{*},\text{rm}'_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\\
={}& v_{i}(\psi((\text{rm}'_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}});\text{rm}_{i})-p_{i}((\text{rm}'_{i},\overrightarrow{\text{rm}}_{-i}),\theta_{\text{init}}).
\end{align*}

Note that (2) comes from the definition of $V$:

\begin{align*}
V(\text{rm}^{*},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})&=\inf_{\substack{\text{finite and distinct sequences}\\ [\text{rm}_{i}^{0}:=\text{rm}^{*},\text{rm}_{i}^{1},\cdots,\text{rm}_{i}^{k},\text{rm}_{i}^{k+1}:=\text{rm}_{i}]}}\sum_{j=0}^{k}l(\text{rm}_{i}^{j},\text{rm}_{i}^{j+1};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\\
&\leq\inf_{\substack{\text{finite and distinct sequences}\\ [\text{rm}_{i}^{0}:=\text{rm}^{*},\text{rm}_{i}^{1},\cdots,\text{rm}_{i}^{k},\text{rm}_{i}^{k+1}:=\text{rm}'_{i}]}}\sum_{j=0}^{k}l(\text{rm}_{i}^{j},\text{rm}_{i}^{j+1};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})+l(\text{rm}'_{i},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})\\
&=V(\text{rm}^{*},\text{rm}'_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}})+l(\text{rm}'_{i},\text{rm}_{i};\overrightarrow{\text{rm}}_{-i},\theta_{\text{init}}).
\end{align*}

This means that the mechanism $(\psi,p)$ satisfies DSIC, which suffices to show that $\psi$ is implementable. ∎
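The shortest-path payment construction above can be sanity-checked numerically. The sketch below is illustrative only (the menu of output distributions, the discretized type space, and the reference type are our own toy assumptions, not from the paper): it takes $\psi$ to be the welfare-maximizing choice from a finite menu, defines $l$ exactly as in the proof, computes $V(\text{rm}^{*},\cdot)$ by brute-force search over short sequences, and verifies that the resulting payments make truthful reporting a dominant strategy.

```python
import itertools

import numpy as np

# Toy check of the shortest-path payment construction (all numbers are
# illustrative assumptions). An "outcome" is a distribution over |T*| = 3
# outputs; psi picks, from a finite menu, the distribution maximizing the
# reported reward -- a cyclically monotone rule.
outcomes = [np.array(q) for q in
            [(0.8, 0.1, 0.1), (0.1, 0.8, 0.1), (0.1, 0.1, 0.8)]]
types = [np.array(r) for r in
         [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.0)]]
star = np.ones(3) / 3                  # the reference type rm*

def value(q, rm):                      # v_i(theta; rm) = E_{x ~ LLM}[rm(x)]
    return float(q @ rm)

def psi(rm):                           # single-agent training rule
    return max(outcomes, key=lambda q: value(q, rm))

def l(a, b):                           # l(rm, rm') = v(psi(rm'); rm') - v(psi(rm); rm')
    return value(psi(b), b) - value(psi(a), b)

def V(target):                         # brute-force shortest l-length from rm* to target
    best = l(star, target)
    for k in (1, 2):
        for mid in itertools.permutations(types, k):
            seq = [star, *mid, target]
            best = min(best, sum(l(a, b) for a, b in zip(seq, seq[1:])))
    return best

payment = [V(rm) for rm in types]      # p_i(rm) = V(rm*, rm)

# DSIC: truthful reporting maximizes quasi-linear utility.
for i, true_rm in enumerate(types):
    u_true = value(psi(true_rm), true_rm) - payment[i]
    for j, lie in enumerate(types):
        assert u_true >= value(psi(lie), true_rm) - payment[j] - 1e-9
print("shortest-path payments are DSIC on the toy instance")
```

Since the argmax rule makes every $l(a,b)\geq 0$, the infimum here is attained among these short sequences, which is why brute force suffices on this toy instance.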

See 3.7

Proof.

We follow the result of Theorem 1.37 in Nisan et al. [2007].

Lemma A.1 (Theorem 1.37 in Nisan et al. [2007]).

Assume that $\mathcal{R}_{1},\mathcal{R}_{2},\cdots,\mathcal{R}_{n}$ are connected sets in Euclidean space. Then all implementable training rules $\psi$ satisfy payment equivalence.

Since in our paper we assume that, for all $i\in[n]$, $\mathcal{R}_{i}$ is the set of all non-negative and normalized $|T^{\ast}|$-dimensional vectors, which is a connected set under the usual Euclidean metric, the theorem holds. ∎

Appendix B Omitted proofs in Section 4

See 4.2

Proof.

We assume that for group $i$, the true reward model is $\text{rm}_{i}$ and the true number of agents is $w_{i}$. The reports of the other groups are $(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i})$ and the initial model is $\theta_{\text{init}}$.

(1) $(\psi,p^{AFF})$ satisfies DSIC.

We compare the utility of reporting $(\text{rm}_{i},w_{i})$ with that of any other report $(\text{rm}'_{i},w'_{i})$. For convenience, we first simplify the notation by letting

\begin{align*}
\theta&=\psi((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i}),\theta_{\text{init}}),\\
\theta'&=\psi((\text{rm}'_{i},\overrightarrow{\text{rm}}_{-i}),(w'_{i},\vec{w}_{-i}),\theta_{\text{init}}).
\end{align*}

The valuation of group $i$ is the per-agent valuation multiplied by the true number of agents:

\begin{align*}
v_{i}&=w_{i}v_{i}(\theta;\text{rm}_{i}),\\
v'_{i}&=w_{i}v_{i}(\theta';\text{rm}_{i}).
\end{align*}

According to the payment rule $p^{AFF}$, the payment $p_{i}$ for reporting $(\text{rm}_{i},w_{i})$ and $p'_{i}$ for reporting $(\text{rm}'_{i},w'_{i})$ are

\begin{align*}
p_{i}&=ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\psi(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}});\theta_{\text{init}})-ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta;\theta_{\text{init}}),\\
p'_{i}&=ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\psi(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}});\theta_{\text{init}})-ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta';\theta_{\text{init}}).
\end{align*}

Therefore, we can calculate the change in utility:

\begin{align*}
u'_{i}-u_{i}={}&(v'_{i}-p'_{i})-(v_{i}-p_{i})\\
={}&\left(w_{i}v_{i}(\theta';\text{rm}_{i})+ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta';\theta_{\text{init}})\right)\\
&-\left(w_{i}v_{i}(\theta;\text{rm}_{i})+ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta;\theta_{\text{init}})\right)\\
={}&ASW((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i}),\theta';\theta_{\text{init}})-ASW((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i}),\theta;\theta_{\text{init}})\\
\leq{}&0.
\end{align*}

The last inequality holds by the definition of $\theta$:

$$\theta=\psi((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i}),\theta_{\text{init}})=\arg\max_{\hat{\theta}\in\Theta}ASW((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i}),\hat{\theta};\theta_{\text{init}}).$$

Therefore, we can conclude that for all $\overrightarrow{\text{rm}}$, $\vec{w}$ and any possible $\text{rm}'_{i}$, $w'_{i}$, we have

$$u_{i}((\overrightarrow{\text{rm}},\vec{w});\psi,p^{AFF},\text{rm}_{i},w_{i})\geq u_{i}(((\text{rm}'_{i},\overrightarrow{\text{rm}}_{-i}),(w'_{i},\vec{w}_{-i}));\psi,p^{AFF},\text{rm}_{i},w_{i}).$$

(2) $(\psi,p^{AFF})$ satisfies IR.

We reuse the notation above and let $\theta_{-i}$ denote the optimal parameter for all groups except $i$, i.e., $\theta_{-i}=\psi(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})$. When group $i$ truthfully reports its reward model $\text{rm}_{i}$ and agent number $w_{i}$, its utility can be written as

\begin{align*}
u_{i}&=v_{i}-p_{i}\\
&=w_{i}v_{i}(\theta;\text{rm}_{i})-ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{-i};\theta_{\text{init}})+ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta;\theta_{\text{init}})\\
&=w_{i}v_{i}(\theta;\text{rm}_{i})+ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta;\theta_{\text{init}})-ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{-i};\theta_{\text{init}})\\
&=ASW(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})-ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{-i};\theta_{\text{init}})\\
&\geq ASW(\overrightarrow{\text{rm}},\vec{w},\theta_{-i};\theta_{\text{init}})-ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{-i};\theta_{\text{init}})\\
&=w_{i}v_{i}(\theta_{-i};\text{rm}_{i})+ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{-i};\theta_{\text{init}})-ASW_{-i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{-i};\theta_{\text{init}})\\
&=w_{i}v_{i}(\theta_{-i};\text{rm}_{i})\geq 0.
\end{align*}

Therefore, we can conclude that for all $\overrightarrow{\text{rm}}$ and $\vec{w}$,
\[
u_{i}((\overrightarrow{\text{rm}},\vec{w});\psi,p^{AFF},\text{rm}_{i},w_{i})\geq 0.
\]
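The IR and DSIC arguments above can be sanity-checked numerically. The sketch below (all concrete numbers and helper names such as `train`, `asw`, and `payment` are illustrative, not from the paper) instantiates the SW-maximizing rule with a KL regularizer on a small finite output space, where the optimum has a closed softmax form, and verifies by brute force over random misreports that truthful reporting maximizes group $i$'s quasilinear utility and that this utility is non-negative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance (illustrative): finite output space T*, n groups with
# nonnegative reward models rm_i and weights w_i, KL-regularized
# SW-maximizing training rule, and the affine maximizer payment
#   p_i = ASW_{-i}(theta_{-i}) - ASW_{-i}(theta).
T, n, lam = 5, 3, 1.0
llm_init = np.full(T, 1.0 / T)

def train(rms, ws):
    """argmax_p sum_i w_i p.rm_i - lam*KL(p || llm_init), in closed form."""
    logits = np.log(llm_init) + (ws[:, None] * rms).sum(axis=0) / lam
    p = np.exp(logits - logits.max())
    return p / p.sum()

def asw(rms, ws, p):
    """Affine social welfare including the KL regularizer."""
    kl = float(np.sum(p * np.log(p / llm_init)))
    return float((ws[:, None] * rms).sum(axis=0) @ p) - lam * kl

def payment(i, rms, ws):
    """Affine maximizer payment: externality imposed on the other groups."""
    mask = np.arange(len(ws)) != i
    theta = train(rms, ws)
    theta_minus = train(rms[mask], ws[mask])
    return asw(rms[mask], ws[mask], theta_minus) - asw(rms[mask], ws[mask], theta)

true_rms = rng.random((n, T))        # nonnegative reward models
true_ws = rng.uniform(0.5, 2.0, n)   # positive weights

i = 0
u_truth = (true_ws[i] * true_rms[i] @ train(true_rms, true_ws)
           - payment(i, true_rms, true_ws))
assert u_truth >= -1e-9              # individual rationality

dsic_ok = True
for _ in range(200):                 # random misreports of (rm_i, w_i)
    rms_r, ws_r = true_rms.copy(), true_ws.copy()
    rms_r[i], ws_r[i] = rng.random(T), rng.uniform(0.5, 2.0)
    u_mis = (true_ws[i] * true_rms[i] @ train(rms_r, ws_r)
             - payment(i, rms_r, ws_r))
    dsic_ok = dsic_ok and (u_truth >= u_mis - 1e-9)
assert dsic_ok                       # truthful reporting is dominant
```

The payment depends on group $i$'s report only through the trained model, which is exactly why the telescoping argument above goes through.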

Theorem 4.4 (restated).

Proof.

(1) For $D(p\|q)=\sum_{{\bm{x}}\in T^{\ast}}p({\bm{x}})\log\frac{p({\bm{x}})}{q({\bm{x}})}$ (KL divergence), since $T^{\ast}$ is a finite set, we can rewrite the training rule $\psi$ as the following optimization problem:
\begin{align*}
\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})=\arg\max_{\theta\in\Theta}\ &\sum_{{\bm{x}}\in T^{\ast}}\left(\text{LLM}_{\theta}({\bm{x}})\sum_{i=1}^{n}w_{i}\text{rm}_{i}({\bm{x}})-\lambda\,\text{LLM}_{\theta}({\bm{x}})\log\frac{\text{LLM}_{\theta}({\bm{x}})}{\text{LLM}_{\theta_{\text{init}}}({\bm{x}})}\right)\\
\text{s.t.}\quad&\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})=1\\
&\text{LLM}_{\theta}({\bm{x}})\geq 0\quad\forall{\bm{x}}\in T^{\ast}.
\end{align*}

Since we have assumed that the optimal point is unique and that the optimal model satisfies $\text{LLM}_{\theta}({\bm{x}})>0$ for all ${\bm{x}}\in T^{\ast}$, the necessary condition for an optimal $\theta$ is that there exists $\mu\in\mathbb{R}$ such that
\[
\sum_{i=1}^{n}w_{i}\text{rm}_{i}({\bm{x}})-\lambda\log\frac{\text{LLM}_{\theta}({\bm{x}})}{\text{LLM}_{\theta_{\text{init}}}({\bm{x}})}-\lambda=\mu\quad\forall{\bm{x}}\in T^{\ast}.
\]

Similarly, for the input $(\overrightarrow{\text{rm}}^{\prime},\vec{w}^{\prime})$, there exists $\mu^{\prime}\in\mathbb{R}$ such that the optimal $\theta^{\prime}$ satisfies
\[
\sum_{i=1}^{n}w^{\prime}_{i}\text{rm}^{\prime}_{i}({\bm{x}})-\lambda\log\frac{\text{LLM}_{\theta^{\prime}}({\bm{x}})}{\text{LLM}_{\theta_{\text{init}}}({\bm{x}})}-\lambda=\mu^{\prime}\quad\forall{\bm{x}}\in T^{\ast}.
\]

For convenience, we define $\Delta({\bm{x}})=\sum_{i=1}^{n}w^{\prime}_{i}\text{rm}^{\prime}_{i}({\bm{x}})-\sum_{i=1}^{n}w_{i}\text{rm}_{i}({\bm{x}})$. Then the relationship between $\text{LLM}_{\theta}({\bm{x}})$ and $\text{LLM}_{\theta^{\prime}}({\bm{x}})$ is given by
\[
\text{LLM}_{\theta^{\prime}}({\bm{x}})=\text{LLM}_{\theta}({\bm{x}})\,e^{\frac{1}{\lambda}(\Delta({\bm{x}})+\mu-\mu^{\prime})}.
\]

Note that we also have the condition

\[
\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta^{\prime}}({\bm{x}})=\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})\,e^{\frac{1}{\lambda}(\Delta({\bm{x}})+\mu-\mu^{\prime})}=1.
\]

Since $\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})e^{\frac{1}{\lambda}(\Delta({\bm{x}})+\mu-\mu^{\prime})}=e^{\frac{1}{\lambda}(\mu-\mu^{\prime})}\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})e^{\frac{1}{\lambda}\Delta({\bm{x}})}$, we can infer that
\begin{align*}
1&=e^{\frac{1}{\lambda}(\mu-\mu^{\prime})}\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})e^{\frac{1}{\lambda}\Delta({\bm{x}})}\leq e^{\frac{1}{\lambda}(\mu-\mu^{\prime})}\max_{{\bm{x}}\in T^{\ast}}e^{\frac{1}{\lambda}\Delta({\bm{x}})},\\
1&=e^{\frac{1}{\lambda}(\mu-\mu^{\prime})}\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})e^{\frac{1}{\lambda}\Delta({\bm{x}})}\geq e^{\frac{1}{\lambda}(\mu-\mu^{\prime})}\min_{{\bm{x}}\in T^{\ast}}e^{\frac{1}{\lambda}\Delta({\bm{x}})}.
\end{align*}

This is equivalent to

\[
\min_{{\bm{x}}\in T^{\ast}}\Delta({\bm{x}})\leq\mu^{\prime}-\mu\leq\max_{{\bm{x}}\in T^{\ast}}\Delta({\bm{x}}).
\]

Thus, the difference between $\text{LLM}_{\theta}({\bm{x}})$ and $\text{LLM}_{\theta^{\prime}}({\bm{x}})$ can be bounded by
\begin{align*}
|\text{LLM}_{\theta^{\prime}}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})|&=\left|1-e^{\frac{1}{\lambda}(\Delta({\bm{x}})+\mu-\mu^{\prime})}\right|\text{LLM}_{\theta}({\bm{x}})\\
&\leq\left|1-e^{\frac{1}{\lambda}(\Delta({\bm{x}})+\mu-\mu^{\prime})}\right|\\
&\leq\max\Big\{\max_{{\bm{x}}\in T^{\ast}}e^{\frac{2\Delta({\bm{x}})}{\lambda}}-1,\ \max_{{\bm{x}}\in T^{\ast}}1-e^{\frac{2\Delta({\bm{x}})}{\lambda}}\Big\}.
\end{align*}

For any $\delta>0$, when we set $\max_{{\bm{x}}\in T^{\ast}}|\Delta({\bm{x}})|\leq\min\{\frac{\lambda}{2}\log\frac{1}{1-\delta},\frac{\lambda}{2}\log(1+\delta)\}$, we have
\[
|\text{LLM}_{\theta^{\prime}}({\bm{x}})-\text{LLM}_{\theta}({\bm{x}})|\leq\max\Big\{\max_{{\bm{x}}\in T^{\ast}}e^{\frac{2\Delta({\bm{x}})}{\lambda}}-1,\ \max_{{\bm{x}}\in T^{\ast}}1-e^{\frac{2\Delta({\bm{x}})}{\lambda}}\Big\}\leq\delta.
\]
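The continuity bound of part (1) can be checked numerically using the closed-form softmax optimum derived above. In the sketch below (the name `fine_tune` and all constants are illustrative), we perturb the aggregate reward by at most the budget from the proof and confirm the output distribution moves by at most $\delta$.

```python
import numpy as np

rng = np.random.default_rng(1)

# On a finite T*, the KL-regularized optimum is the softmax tilt
#   LLM_theta(x) ∝ LLM_init(x) * exp(r(x)/lam),  r = sum_i w_i rm_i.
T, lam = 8, 2.0
llm_init = rng.dirichlet(np.ones(T))

def fine_tune(r):
    p = llm_init * np.exp(r / lam)
    return p / p.sum()

r = rng.random(T)                 # aggregate reward sum_i w_i rm_i
delta_target = 0.05               # desired output perturbation bound
# Budget from the proof: max|Delta| <= min{(lam/2)log(1/(1-d)), (lam/2)log(1+d)}
budget = min(0.5 * lam * np.log(1 / (1 - delta_target)),
             0.5 * lam * np.log(1 + delta_target))

max_gap = 0.0
for _ in range(500):
    delta = rng.uniform(-budget, budget, T)   # perturbed aggregate rewards
    gap = np.abs(fine_tune(r + delta) - fine_tune(r)).max()
    max_gap = max(max_gap, gap)

assert max_gap <= delta_target + 1e-12
```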

(2) For $D(p\|q)=\sum_{{\bm{x}}\in T^{\ast}}(p({\bm{x}})-q({\bm{x}}))^{2}$ ($L_{2}$ distance), since $T^{\ast}$ is a finite set, we can rewrite the training rule $\psi$ as the following optimization problem:
\begin{align*}
\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})=\arg\max_{\theta\in\Theta}\ &\sum_{x\in T^{\ast}}\left(\text{LLM}_{\theta}(x)\sum_{i=1}^{n}w_{i}\text{rm}_{i}(x)-\lambda(\text{LLM}_{\theta}(x)-\text{LLM}_{\theta_{\text{init}}}(x))^{2}\right)\\
\text{s.t.}\quad&\sum_{x\in T^{\ast}}\text{LLM}_{\theta}(x)=1\\
&\text{LLM}_{\theta}(x)\geq 0\quad\forall x\in T^{\ast}.
\end{align*}

Since we have assumed that the optimal point is unique and that the optimal model satisfies $\text{LLM}_{\theta}(x)>0$ for all $x\in T^{\ast}$, the necessary condition for an optimal $\theta$ is that there exists $\mu\in\mathbb{R}$ such that
\[
\sum_{i=1}^{n}w_{i}\text{rm}_{i}(x)-2\lambda(\text{LLM}_{\theta}(x)-\text{LLM}_{\theta_{\text{init}}}(x))=\mu\quad\forall x\in T^{\ast}.
\]

Similarly, for the input $(\overrightarrow{\text{rm}}^{\prime},\vec{w}^{\prime})$, there exists $\mu^{\prime}\in\mathbb{R}$ such that the optimal $\theta^{\prime}$ satisfies
\[
\sum_{i=1}^{n}w^{\prime}_{i}\text{rm}^{\prime}_{i}(x)-2\lambda(\text{LLM}_{\theta^{\prime}}(x)-\text{LLM}_{\theta_{\text{init}}}(x))=\mu^{\prime}\quad\forall x\in T^{\ast}.
\]

For convenience, we define $\Delta(x)=\sum_{i=1}^{n}w^{\prime}_{i}\text{rm}^{\prime}_{i}(x)-\sum_{i=1}^{n}w_{i}\text{rm}_{i}(x)$. Then the relationship between $\text{LLM}_{\theta}(x)$ and $\text{LLM}_{\theta^{\prime}}(x)$ is given by
\[
\text{LLM}_{\theta^{\prime}}(x)=\text{LLM}_{\theta}(x)+\frac{1}{2\lambda}(\Delta(x)+\mu-\mu^{\prime}).
\]

Note that we also have the condition

\[
\sum_{x\in T^{\ast}}\text{LLM}_{\theta^{\prime}}(x)=\sum_{x\in T^{\ast}}\Big(\text{LLM}_{\theta}(x)+\frac{1}{2\lambda}(\Delta(x)+\mu-\mu^{\prime})\Big)=1.
\]

Since $\sum_{x\in T^{\ast}}\text{LLM}_{\theta}(x)=1$, we can infer that
\[
\sum_{x\in T^{\ast}}\frac{1}{2\lambda}(\Delta(x)+\mu-\mu^{\prime})=0.
\]

This is equivalent to

\[
\mu^{\prime}-\mu=\frac{1}{|T^{\ast}|}\sum_{x\in T^{\ast}}\Delta(x).
\]

Thus, the difference between $\text{LLM}_{\theta}(x)$ and $\text{LLM}_{\theta^{\prime}}(x)$ can be bounded by
\[
|\text{LLM}_{\theta^{\prime}}(x)-\text{LLM}_{\theta}(x)|=\left|\frac{1}{2\lambda}(\Delta(x)+\mu-\mu^{\prime})\right|\leq\frac{1}{\lambda}\max_{x\in T^{\ast}}|\Delta(x)|.
\]

For any $\delta>0$, when we set $\max_{x\in T^{\ast}}|\Delta(x)|\leq\lambda\delta$, we have
\[
|\text{LLM}_{\theta^{\prime}}(x)-\text{LLM}_{\theta}(x)|\leq\frac{1}{\lambda}\max_{x\in T^{\ast}}|\Delta(x)|\leq\delta.
\]
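Part (2) admits the same kind of numerical sanity check. Here the interior optimum is an additive shift of the initial model (this closed form follows from the first-order condition above together with the normalization constraint; the name `fine_tune` and all constants are illustrative, and $\lambda$ is chosen large enough that the non-negativity constraints stay inactive).

```python
import numpy as np

rng = np.random.default_rng(2)

# With the squared-distance regularizer, the interior optimum on a finite
# T* is the closed-form shift
#   LLM_theta(x) = LLM_init(x) + (r(x) - mean(r)) / (2*lam),
# since mu = mean(r) by the normalization constraint.
T, lam = 6, 5.0
llm_init = np.full(T, 1.0 / T)

def fine_tune(r):
    p = llm_init + (r - r.mean()) / (2 * lam)
    assert (p >= 0).all()        # stay in the interior for this sketch
    return p

r = rng.random(T)
delta_target = 0.02
budget = lam * delta_target      # proof's condition: max|Delta| <= lam * delta

max_gap = 0.0
for _ in range(500):
    delta = rng.uniform(-budget, budget, T)   # perturbed aggregate rewards
    gap = np.abs(fine_tune(r + delta) - fine_tune(r)).max()
    max_gap = max(max_gap, gap)

assert max_gap <= delta_target + 1e-12
```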

Theorem 4.5 (restated).

Proof.

We prove the following equivalent version of payment equivalence: for any group $i$, when fixing the other groups' reports $(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i})$ and $\theta_{\text{init}}$, any two payment rules $p$, $p^{\prime}$ that implement $\psi$ in DSIC must satisfy that there exists a constant $c$ such that $p_{i}(\text{rm}_{i},w_{i})-p^{\prime}_{i}(\text{rm}_{i},w_{i})=c$ for all $\text{rm}_{i}$ and $w_{i}$. In the rest of the proof, we therefore fix $(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i})$ and $\theta_{\text{init}}$ and omit them from the notation.

We first redefine the functions $l(\cdot,\cdot)$ and $V(\cdot,\cdot)$ introduced in the previous section. Both are defined on the type space of the group. In the simplified case in Section 3, the type is exactly the reward model $\text{rm}_{i}$; when we consider both $\text{rm}_{i}$ and $w_{i}$, the type must contain both pieces of information, so we use $t_{i}$ to denote the pair $(\text{rm}_{i},w_{i})$, whose domain is $\mathcal{R}_{i}\times\mathcal{W}_{i}$. Unless stated otherwise, $t_{i}$ denotes the pair $(\text{rm}_{i},w_{i})$ with the same superscript and subscript; for example, $t_{i}^{k}=(\text{rm}_{i}^{k},w_{i}^{k})$.

Similar to the simplified version discussed in Section 3, we let $l(t^{\prime}_{i},t_{i})$ be the change in valuation from misreporting type $t^{\prime}_{i}$ to truthfully reporting type $t_{i}$. Formally,
\[
l(t^{\prime}_{i},t_{i}):=w_{i}v_{i}(\psi(t_{i});\text{rm}_{i})-w_{i}v_{i}(\psi(t^{\prime}_{i});\text{rm}_{i}).
\]

And $V(t^{\prime}_{i},t_{i})$ is the infimum of the total $l$-weight over all finite and distinct paths from $t^{\prime}_{i}$ to $t_{i}$:
\[
V(t^{\prime}_{i},t_{i}):=\inf_{\substack{\text{finite and distinct sequences}\\ [t_{i}^{0}:=t^{\prime}_{i},t_{i}^{1},\cdots,t_{i}^{k},t_{i}^{k+1}:=t_{i}]}}\ \sum_{j=0}^{k}l(t_{i}^{j},t_{i}^{j+1}).
\]

We first prove the following lemma, which is a special case of a result in Heydenreich et al. [2009].

Lemma B.1 .

An implemented training rule $\psi$ satisfies payment equivalence if for any agent $i$ and any types $t_{i}$, $t^{\prime}_{i}$, we have
\[
V(t_{i},t^{\prime}_{i})=-V(t^{\prime}_{i},t_{i}).
\]
Proof.

Assume that a mechanism $(\psi,p)$ satisfies DSIC. For any two types $t_{i}$, $t^{\prime}_{i}$ and any finite and distinct sequence $[t^{\prime}_{i},t_{i}^{1},\cdots,t_{i}^{k},t_{i}]$, let $t_{i}^{0}=t^{\prime}_{i}$ and $t_{i}^{k+1}=t_{i}$. We have
\[
w^{j+1}_{i}v_{i}(\psi(t^{j+1}_{i}),\text{rm}^{j+1}_{i})-p_{i}(t^{j+1}_{i})\geq w^{j+1}_{i}v_{i}(\psi(t^{j}_{i}),\text{rm}^{j+1}_{i})-p_{i}(t^{j}_{i})\quad\forall\,0\leq j\leq k.
\]

This can be rewritten as

\[
w^{j+1}_{i}v_{i}(\psi(t^{j+1}_{i}),\text{rm}^{j+1}_{i})-w^{j+1}_{i}v_{i}(\psi(t^{j}_{i}),\text{rm}^{j+1}_{i})\geq p_{i}(t^{j+1}_{i})-p_{i}(t^{j}_{i})\quad\forall\,0\leq j\leq k.
\]

Summing over $j$, we obtain
\begin{align*}
\sum_{j=0}^{k}l(t_{i}^{j},t_{i}^{j+1})&=\sum_{j=0}^{k}w^{j+1}_{i}v_{i}(\psi(t^{j+1}_{i}),\text{rm}^{j+1}_{i})-w^{j+1}_{i}v_{i}(\psi(t^{j}_{i}),\text{rm}^{j+1}_{i})\\
&\geq\sum_{j=0}^{k}p_{i}(t^{j+1}_{i})-p_{i}(t^{j}_{i})=p_{i}(t_{i})-p_{i}(t^{\prime}_{i}).
\end{align*}

Since this holds for arbitrary finite and distinct sequences, we can infer that $V(t^{\prime}_{i},t_{i})\geq p_{i}(t_{i})-p_{i}(t^{\prime}_{i})$. Similarly, $V(t_{i},t^{\prime}_{i})\geq p_{i}(t^{\prime}_{i})-p_{i}(t_{i})$. Combining these results with $V(t_{i},t^{\prime}_{i})=-V(t^{\prime}_{i},t_{i})$, we obtain
\[
V(t_{i},t^{\prime}_{i})=-V(t^{\prime}_{i},t_{i})\leq p_{i}(t^{\prime}_{i})-p_{i}(t_{i})\leq V(t_{i},t^{\prime}_{i}),
\]

which means that $p_{i}(t^{\prime}_{i})-p_{i}(t_{i})=V(t_{i},t^{\prime}_{i})$. Note that this holds for arbitrary $t_{i}$ and $t^{\prime}_{i}$. Therefore, once the payment $p_{i}(t_{i})$ is determined for some $t_{i}$, the payments for all other $t^{\prime}_{i}$ are determined. In particular, suppose two payment rules $p$ and $p^{\prime}$ both implement $\psi$ in DSIC, and denote by $p^{*}$ and $p^{\prime*}$ the respective payments when $i$ reports the uniform reward model $\text{rm}^{*}$ defined in Equation 1 and $w_{i}=1$. Then for all $t_{i}$,
\begin{align*}
p_{i}(t_{i})-p^{\prime}_{i}(t_{i})&=\left(p_{i}(t_{i})-p^{*}\right)-\left(p^{\prime}_{i}(t_{i})-p^{\prime*}\right)+p^{*}-p^{\prime*}\\
&=V((\text{rm}^{*},1),t_{i})-V((\text{rm}^{*},1),t_{i})+p^{*}-p^{\prime*}\\
&=p^{*}-p^{\prime*}.
\end{align*}

Note that $p^{*}$ and $p^{\prime*}$ are not influenced by $i$'s report, but they may vary with $\overrightarrow{\text{rm}}_{-i}$, $\vec{w}_{-i}$ and $\theta_{\text{init}}$, which means that we can regard the term $p^{*}-p^{\prime*}$ as a function $f$ of $(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})$. ∎
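The mechanics of Lemma B.1 can be illustrated on a discretized toy instance (everything below is an illustrative single-agent, single-parameter example, not the paper's setting): types $t\in[0,1]$ on a fine grid, valuation $v(\theta;t)=t\theta$, a designer term $-\theta^{2}$, the maximizing rule $\psi(t)=t/2$, and the VCG-style payment $p(t)=\psi(t)^{2}$. Computing $V$ as all-pairs shortest paths over the $l$-weighted complete digraph shows near-antisymmetry and $p(t^{\prime})-p(t)=V(t,t^{\prime})$ up to discretization error.

```python
import numpy as np

# Fine grid of types; psi is the SW-maximizing outcome, pay the VCG-style
# payment -c(psi(t)) with designer term c(theta) = -theta^2.
m = 101
ts = np.linspace(0.0, 1.0, m)
psi = ts / 2.0
pay = psi ** 2

# l[a, b] = v(psi(t_b); t_b) - v(psi(t_a); t_b): valuation gained by
# reporting t_b truthfully instead of t_a.
l = ts[None, :] * (psi[None, :] - psi[:, None])

# V = all-pairs shortest paths (Floyd-Warshall, vectorized per pivot).
# Negative edges exist but all cycles have non-negative weight, as required
# for an implementable rule, so shortest paths are well defined.
V = l.copy()
for k in range(m):
    V = np.minimum(V, V[:, k:k+1] + V[k:k+1, :])

assert np.abs(V + V.T).max() < 1e-2                             # near-antisymmetry
assert np.abs((pay[None, :] - pay[:, None]) - V).max() < 1e-2   # payment equivalence
```

As the grid is refined, both gaps shrink toward zero, matching the exact antisymmetry that the continuity condition delivers in the paper's continuous type space.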

Then we show that under 4.3, the SW-Maximizing training rule satisfies the condition stated in Lemma B.1. Firstly, we show that for any $t_{i},t^{\prime}_{i}$, we have $V(t_{i},t^{\prime}_{i})+V(t^{\prime}_{i},t_{i})\geq 0$. By the definition of $V(\cdot,\cdot)$, $V(t_{i},t^{\prime}_{i})$ and $V(t^{\prime}_{i},t_{i})$ are the weights of the shortest paths from $t_{i}$ to $t^{\prime}_{i}$ and from $t^{\prime}_{i}$ to $t_{i}$, respectively, so $V(t_{i},t^{\prime}_{i})+V(t^{\prime}_{i},t_{i})$ is the smallest weight of a cycle through $t_{i}$ and $t^{\prime}_{i}$. Since the SW-Maximizing training rule is implementable, by Theorem 3.5 the weight of any cycle is non-negative. Therefore, $V(t_{i},t^{\prime}_{i})+V(t^{\prime}_{i},t_{i})\geq 0$.

Then we show that for any $t_{i},t^{\prime}_{i}$ and any $\epsilon>0$, $V(t_{i},t^{\prime}_{i})+V(t^{\prime}_{i},t_{i})\leq\epsilon$. We prove this by constructing a finite and distinct sequence $[t_{i},t_{i}^{1},\cdots,t_{i}^{k},t^{\prime}_{i}]$ such that
\begin{equation}
\sum_{j=0}^{k}l(t_{i}^{j},t_{i}^{j+1})+\sum_{j=0}^{k}l(t_{i}^{j+1},t_{i}^{j})\leq\epsilon.\tag{2}
\end{equation}

This suffices to show $V(t_{i},t^{\prime}_{i})+V(t^{\prime}_{i},t_{i})\leq\epsilon$, since $V(t_{i},t^{\prime}_{i})$ and $V(t^{\prime}_{i},t_{i})$ are lower bounds for $\sum_{j=0}^{k}l(t_{i}^{j},t_{i}^{j+1})$ and $\sum_{j=0}^{k}l(t_{i}^{j+1},t_{i}^{j})$, respectively.

First, we rewrite the LHS of Equation 2 using the definition of $l(\cdot,\cdot)$:

\begin{align*}
&\sum_{j=0}^{k}l(t_{i}^{j},t_{i}^{j+1})+\sum_{j=0}^{k}l(t_{i}^{j+1},t_{i}^{j})\\
=\ &\sum_{j=0}^{k}\left(w^{j+1}_{i}v_{i}(\psi(t^{j+1}_{i}),\text{rm}^{j+1}_{i})-w^{j+1}_{i}v_{i}(\psi(t^{j}_{i}),\text{rm}^{j+1}_{i})\right)+\sum_{j=0}^{k}\left(w^{j}_{i}v_{i}(\psi(t^{j}_{i}),\text{rm}^{j}_{i})-w^{j}_{i}v_{i}(\psi(t^{j+1}_{i}),\text{rm}^{j}_{i})\right)\\
=\ &\sum_{j=0}^{k}w_{i}^{j+1}(\text{LLM}_{\theta^{j+1}}-\text{LLM}_{\theta^{j}})\cdot\text{rm}_{i}^{j+1}+\sum_{j=0}^{k}w_{i}^{j}(\text{LLM}_{\theta^{j}}-\text{LLM}_{\theta^{j+1}})\cdot\text{rm}_{i}^{j}\\
=\ &\sum_{j=0}^{k}(\text{LLM}_{\theta^{j+1}}-\text{LLM}_{\theta^{j}})\cdot(w_{i}^{j+1}\text{rm}_{i}^{j+1}-w_{i}^{j}\text{rm}_{i}^{j})\\
=\ &\sum_{j=0}^{k}\sum_{x\in T^{\ast}}(\text{LLM}_{\theta^{j+1}}(x)-\text{LLM}_{\theta^{j}}(x))(w_{i}^{j+1}\text{rm}_{i}^{j+1}(x)-w_{i}^{j}\text{rm}_{i}^{j}(x)).
\end{align*}

In the above equations, $\theta^{j}=\psi(t_{i}^{j})$ for $0\leq j\leq k+1$.

By 4.3, when $\overrightarrow{\text{rm}}_{-i}$, $\vec{w}_{-i}$ and $\theta_{\text{init}}$ are fixed, there exists $\delta>0$ such that if $\max_{x\in T^{\ast}}|w_{i}\text{rm}_{i}(x)-w^{\prime}_{i}\text{rm}^{\prime}_{i}(x)|\leq\delta$, then $\max_{x\in T^{\ast}}|\text{LLM}_{\theta}(x)-\text{LLM}_{\theta^{\prime}}(x)|\leq\frac{\epsilon}{4\bar{w}}$, where $\theta:=\psi((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i});\theta_{\text{init}})$ and $\theta^{\prime}:=\psi((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),(w^{\prime}_{i},\vec{w}_{-i});\theta_{\text{init}})$.

We construct the sequence $P$ as follows: set $k=2n$ with $n\geq\frac{\bar{w}}{\delta}+1$, and let $t_{i}^{0}=t_{i}$, $t_{i}^{k+1}=t^{\prime}_{i}$. For each $0\leq j\leq n$,
\[
w_{i}^{j}=w_{i}^{0}=w_{i},\quad\text{rm}_{i}^{j}=\text{rm}_{i}+\frac{j}{n}(\text{rm}^{*}-\text{rm}_{i}).
\]

And for each $n+1\leq j\leq 2n+1$,
\[
w_{i}^{j}=w_{i}^{2n+1}=w^{\prime}_{i},\quad\text{rm}_{i}^{j}=\text{rm}^{*}+\frac{j-n-1}{n}(\text{rm}^{\prime}_{i}-\text{rm}^{*}).
\]

In this construction, every $\text{rm}_{i}^{j}$ is a weighted average either of $\text{rm}_{i}$ and $\text{rm}^{*}$ or of $\text{rm}^{\prime}_{i}$ and $\text{rm}^{*}$. This ensures that all reward models in the sequence are valid (normalized and non-negative). We can then split the telescoping sum into three parts, so that the weight $w_{i}$ stays constant within the first and the last parts.
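As an illustration, the interpolation path above can be checked numerically. This is a minimal sketch under illustrative assumptions (a four-element $T^{\ast}$, Dirichlet-sampled endpoints; names such as `path` are not from the paper):

```python
import numpy as np

# Sketch of the interpolation sequence used in the proof: rm_i -> rm* -> rm'_i,
# where rm*(x) = 1/|T*| is the uniform reward model. Illustrative names only.
n = 5
rng = np.random.default_rng(0)
rm_i = rng.dirichlet(np.ones(4))        # normalized, non-negative endpoint
rm_prime_i = rng.dirichlet(np.ones(4))  # the other endpoint
rm_star = np.full(4, 1 / 4)             # rm*(x) = 1/|T*|

# First half: rm_i^j = rm_i + j * (rm* - rm_i) / n, for j = 0..n
first = [rm_i + j * (rm_star - rm_i) / n for j in range(n + 1)]
# Second half: rm_i^j = rm* + (j - n - 1) * (rm'_i - rm*) / n, for j = n+1..2n+1
second = [rm_star + (j - n - 1) * (rm_prime_i - rm_star) / n
          for j in range(n + 1, 2 * n + 2)]
path = first + second

# Every member is a convex combination of two distributions,
# hence itself normalized and non-negative.
for rm in path:
    assert np.all(rm >= -1e-12)
    assert abs(rm.sum() - 1) < 1e-9
# The two middle elements coincide at rm*, so segment (b) contributes 0.
assert np.allclose(path[n], path[n + 1])
```

The two asserts at the end mirror the validity claim in the text and the reason segment (b) vanishes.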

\begin{align}
&\sum_{j=0}^{k}\sum_{x\in T^{\ast}}(\text{LLM}_{\theta^{j+1}}(x)-\text{LLM}_{\theta^{j}}(x))(w_{i}^{j+1}\text{rm}_{i}^{j+1}(x)-w_{i}^{j}\text{rm}_{i}^{j}(x))\nonumber\\
=\;&\sum_{j=0}^{n-1}\sum_{x\in T^{\ast}}w_{i}(\text{LLM}_{\theta^{j+1}}(x)-\text{LLM}_{\theta^{j}}(x))(\text{rm}_{i}^{j+1}(x)-\text{rm}_{i}^{j}(x))\tag{a}\\
+\;&\sum_{x\in T^{\ast}}(\text{LLM}_{\theta^{n+1}}(x)-\text{LLM}_{\theta^{n}}(x))(w^{\prime}_{i}\text{rm}_{i}^{n+1}(x)-w_{i}\text{rm}_{i}^{n}(x))\tag{b}\\
+\;&\sum_{j=n+1}^{2n}\sum_{x\in T^{\ast}}w^{\prime}_{i}(\text{LLM}_{\theta^{j+1}}(x)-\text{LLM}_{\theta^{j}}(x))(\text{rm}_{i}^{j+1}(x)-\text{rm}_{i}^{j}(x))\tag{c}
\end{align}

We first show that (b) equals $0$ by proving $\theta^{n}=\psi((\text{rm}^{*},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i});\theta_{\text{init}})=\psi((\text{rm}^{*},\overrightarrow{\text{rm}}_{-i}),(w^{\prime}_{i},\vec{w}_{-i});\theta_{\text{init}})=\theta^{n+1}$. Suppose for contradiction that $\theta^{n}\neq\psi((\text{rm}^{*},\overrightarrow{\text{rm}}_{-i}),(w^{\prime}_{i},\vec{w}_{-i});\theta_{\text{init}})$. Then, by the uniqueness of the optimal point, we have

\begin{align}
&\sum_{x\in T^{\ast}}\Big(w^{\prime}_{i}\text{rm}^{*}(x)+\sum_{j\neq i}w_{j}\text{rm}_{j}(x)\Big)\text{LLM}_{\theta^{n+1}}(x)+D(\text{LLM}_{\theta^{n+1}},\text{LLM}_{\theta_{\text{init}}})\\
>\;&\sum_{x\in T^{\ast}}\Big(w^{\prime}_{i}\text{rm}^{*}(x)+\sum_{j\neq i}w_{j}\text{rm}_{j}(x)\Big)\text{LLM}_{\theta^{n}}(x)+D(\text{LLM}_{\theta^{n}},\text{LLM}_{\theta_{\text{init}}}).
\end{align}

Note that $\text{rm}^{*}(x)=\frac{1}{|T^{\ast}|}$ for all $x\in T^{\ast}$, so $\sum_{x\in T^{\ast}}(w^{\prime}_{i}-w_{i})\text{rm}^{*}(x)\text{LLM}_{\theta}(x)=\frac{w^{\prime}_{i}-w_{i}}{|T^{\ast}|}$ for any $\theta$. Thus, the above inequality can be rewritten as:

\begin{align}
&\frac{w^{\prime}_{i}-w_{i}}{|T^{\ast}|}+\sum_{x\in T^{\ast}}\Big(w_{i}\text{rm}^{*}(x)+\sum_{j\neq i}w_{j}\text{rm}_{j}(x)\Big)\text{LLM}_{\theta^{n+1}}(x)+D(\text{LLM}_{\theta^{n+1}},\text{LLM}_{\theta_{\text{init}}})\\
>\;&\frac{w^{\prime}_{i}-w_{i}}{|T^{\ast}|}+\sum_{x\in T^{\ast}}\Big(w_{i}\text{rm}^{*}(x)+\sum_{j\neq i}w_{j}\text{rm}_{j}(x)\Big)\text{LLM}_{\theta^{n}}(x)+D(\text{LLM}_{\theta^{n}},\text{LLM}_{\theta_{\text{init}}}).
\end{align}

This contradicts the optimality of $\theta^{n}$. Therefore, $\theta^{n}$ and $\theta^{n+1}$ must be identical, which means that (b) equals $0$.

Then we turn to (a). By the construction, for any $x\in T^{\ast}$ and $0\leq j\leq n-1$, $|w_{i}^{j}\text{rm}^{j}_{i}(x)-w_{i}^{j}\text{rm}^{j+1}_{i}(x)|\leq\frac{\bar{w}}{n}\leq\delta$, so that $|\text{LLM}_{\theta^{j}}(x)-\text{LLM}_{\theta^{j+1}}(x)|\leq\frac{\epsilon}{4\bar{w}}$ holds for all $x$. Then we can derive that:

\begin{align}
&\sum_{j=0}^{n-1}\sum_{x\in T^{\ast}}w_{i}(\text{LLM}_{\theta^{j+1}}(x)-\text{LLM}_{\theta^{j}}(x))(\text{rm}_{i}^{j+1}(x)-\text{rm}_{i}^{j}(x))\\
=\;&\sum_{j=0}^{n-1}\sum_{x\in T^{\ast}}w_{i}(\text{LLM}_{\theta^{j+1}}(x)-\text{LLM}_{\theta^{j}}(x))\frac{\text{rm}^{*}(x)-\text{rm}_{i}(x)}{n}\\
\leq\;&\sum_{j=0}^{n-1}\sum_{x\in T^{\ast}}w_{i}\frac{\epsilon}{4\bar{w}}\cdot\frac{|\text{rm}^{*}(x)-\text{rm}_{i}(x)|}{n}\\
\leq\;&\sum_{x\in T^{\ast}}\frac{\epsilon}{4}|\text{rm}^{*}(x)-\text{rm}_{i}(x)|\\
\leq\;&\sum_{x\in T^{\ast}}\frac{\epsilon}{4}(\text{rm}^{*}(x)+\text{rm}_{i}(x))\\
=\;&\frac{\epsilon}{2}.
\end{align}

The case of (c) is similar. By the construction, for any $x\in T^{\ast}$ and $n+1\leq j\leq 2n$, $|w_{i}^{j}\text{rm}^{j}_{i}(x)-w_{i}^{j}\text{rm}^{j+1}_{i}(x)|\leq\frac{\bar{w}}{n}\leq\delta$, so that $|\text{LLM}_{\theta^{j}}(x)-\text{LLM}_{\theta^{j+1}}(x)|\leq\frac{\epsilon}{4\bar{w}}$ holds for all $x$. Then we can derive that:

\begin{align}
&\sum_{j=n+1}^{2n}\sum_{x\in T^{\ast}}w^{\prime}_{i}(\text{LLM}_{\theta^{j+1}}(x)-\text{LLM}_{\theta^{j}}(x))(\text{rm}_{i}^{j+1}(x)-\text{rm}_{i}^{j}(x))\\
=\;&\sum_{j=n+1}^{2n}\sum_{x\in T^{\ast}}w^{\prime}_{i}(\text{LLM}_{\theta^{j+1}}(x)-\text{LLM}_{\theta^{j}}(x))\frac{\text{rm}^{\prime}_{i}(x)-\text{rm}^{*}(x)}{n}\\
\leq\;&\sum_{j=n+1}^{2n}\sum_{x\in T^{\ast}}w^{\prime}_{i}\frac{\epsilon}{4\bar{w}}\cdot\frac{|\text{rm}^{\prime}_{i}(x)-\text{rm}^{*}(x)|}{n}\\
\leq\;&\sum_{x\in T^{\ast}}\frac{\epsilon}{4}|\text{rm}^{\prime}_{i}(x)-\text{rm}^{*}(x)|\\
\leq\;&\sum_{x\in T^{\ast}}\frac{\epsilon}{4}(\text{rm}^{\prime}_{i}(x)+\text{rm}^{*}(x))\\
=\;&\frac{\epsilon}{2}.
\end{align}

Combining the results from (a), (b), and (c), we have that under this construction,

$$\sum_{j=0}^{k}l(t_{i}^{j},t_{i}^{j+1})+\sum_{j=0}^{k}l(t_{i}^{j+1},t_{i}^{j})\leq\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon.$$

Since $\epsilon>0$ is arbitrary, this suffices to demonstrate that $V(t_{i},t^{\prime}_{i})+V(t^{\prime}_{i},t_{i})\leq 0$.

Therefore, it is proven that

$$V(t_{i},t^{\prime}_{i})+V(t^{\prime}_{i},t_{i})=0,$$

which means that $V(t_{i},t^{\prime}_{i})=-V(t^{\prime}_{i},t_{i})$. By Lemma B.1, this is a sufficient condition for the payment equivalence of $\psi$. ∎

See 4.6

Proof.

Given the payment equivalence of $\psi$ and the fact that $p^{AFF}$ satisfies DSIC, the problem of finding the revenue-maximizing DSIC and IR payment rule can be formulated as an optimization problem. By symmetry, we only consider the payment for agent $i$ here.

\begin{align}
\max_{h_{i}}\quad&\mathbb{E}_{\overrightarrow{\text{rm}}\in\mathcal{R}^{n},\vec{w}\in\mathcal{W}^{n}}\left[p_{i}^{AFF}(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})+h_{i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})\right]\\
\text{s.t.}\quad&p_{i}^{AFF}(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})+h_{i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})\leq w_{i}v_{i}(\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}});\text{rm}_{i})\quad\forall\,\text{rm}_{i}\in\mathcal{R},\,w_{i}\in\mathcal{W}.
\end{align}

The solution of this program is directly given by

\begin{align}
h_{i}(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})=\;&\inf_{\text{rm}^{\prime}_{i}\in\mathcal{R},\,w^{\prime}_{i}\in\mathcal{W}}w^{\prime}_{i}v_{i}(\psi((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),(w^{\prime}_{i},\vec{w}_{-i}),\theta_{\text{init}});\text{rm}^{\prime}_{i})-p_{i}^{AFF}((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),(w^{\prime}_{i},\vec{w}_{-i});\theta_{\text{init}})\\
=:\;&\inf_{\text{rm}^{\prime}_{i}\in\mathcal{R},\,w^{\prime}_{i}\in\mathcal{W}}u_{i}((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),(w^{\prime}_{i},\vec{w}_{-i});\psi,p^{AFF},\text{rm}^{\prime}_{i},w^{\prime}_{i}).
\end{align}

Therefore, the revenue-maximizing payment is

\begin{align}
p_{i}((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i}),\theta_{\text{init}})=\;&p_{i}^{AFF}((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),(w_{i},\vec{w}_{-i});\theta_{\text{init}})\\
+\;&\inf_{\text{rm}^{\prime}_{i}\in\mathcal{R},\,w^{\prime}_{i}\in\mathcal{W}}u_{i}((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),(w^{\prime}_{i},\vec{w}_{-i});\psi,p^{AFF},\text{rm}^{\prime}_{i},w^{\prime}_{i}).
\end{align}

∎
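As a sanity check, this program can be solved numerically on a toy instance. The sketch below assumes (not from the paper's experiments) a KL-regularized social-welfare-maximizing training rule with the closed-form solution $p(x)\propto p_{\text{init}}(x)\exp(\sum_i w_i\text{rm}_i(x)/\lambda)$ over a finite $T^{\ast}$, two agents, and a small discretized report space; all identifiers are illustrative:

```python
import itertools

import numpy as np

# Toy instance: |T*| = 3, two agents, KL-regularized SW-maximizing rule.
lam = 1.0
p_init = np.full(3, 1 / 3)

def train(rms, ws):
    """psi: closed-form maximizer of sum_i w_i <rm_i, p> - lam * KL(p || p_init)."""
    logits = np.log(p_init) + sum(w * r for w, r in zip(ws, rms)) / lam
    p = np.exp(logits - logits.max())
    return p / p.sum()

def asw_minus_i(rms, ws, p, i):
    """ASW_{-i}: welfare of all agents except i at model p, minus the KL penalty."""
    val = sum(w * float(r @ p) for j, (w, r) in enumerate(zip(ws, rms)) if j != i)
    return val - lam * float(np.sum(p * np.log(p / p_init)))

def p_aff(rms, ws, i):
    """Affine maximizer payment: ASW_{-i} at psi(rm_{-i}) minus ASW_{-i} at psi(rm)."""
    others = [(r, w) for j, (r, w) in enumerate(zip(rms, ws)) if j != i]
    p_mi = train([r for r, _ in others], [w for _, w in others])
    p_full = train(rms, ws)
    return asw_minus_i(rms, ws, p_mi, i) - asw_minus_i(rms, ws, p_full, i)

rm_grid = [np.array(v) for v in
           [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1/3, 1/3, 1/3), (.5, .5, 0)]]
w_grid = [0.5, 1.0, 2.0]
rm2, w2 = rm_grid[3], 1.0  # agent 2's fixed report

def u1(rm1, w1):
    """Agent 1's utility under p^AFF when its type equals its report."""
    p = train([rm1, rm2], [w1, w2])
    return w1 * float(rm1 @ p) - p_aff([rm1, rm2], [w1, w2], 0)

# h_1 is the infimum of this utility over the (discretized) report space.
h1 = min(u1(rm, w) for rm, w in itertools.product(rm_grid, w_grid))
```

By construction, charging $p^{AFF}+h_1$ keeps every grid type individually rational, with utility exactly zero at the minimizing report.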

Lemma B.2.

For any $\text{rm},\text{rm}^{\prime}$, if $\max_{{\bm{x}}\in T^{\ast}}|\text{rm}({\bm{x}})-\text{rm}^{\prime}({\bm{x}})|=\epsilon$, then for any model $\theta$, we have

$$|v(\theta;\text{rm})-v(\theta;\text{rm}^{\prime})|\leq\epsilon.$$
Proof.

We can derive that

\begin{align}
|v(\theta;\text{rm})-v(\theta;\text{rm}^{\prime})|&=\Big|\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})(\text{rm}({\bm{x}})-\text{rm}^{\prime}({\bm{x}}))\Big|\leq\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})|\text{rm}({\bm{x}})-\text{rm}^{\prime}({\bm{x}})|\\
&\leq\sum_{{\bm{x}}\in T^{\ast}}\text{LLM}_{\theta}({\bm{x}})\,\epsilon=\epsilon.
\end{align}

∎
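The one-line argument can also be verified numerically; the sketch below (illustrative names, random distributions) checks the Lipschitz bound of Lemma B.2 for a single draw:

```python
import numpy as np

# Check of Lemma B.2: with v(theta; rm) = sum_x LLM_theta(x) * rm(x) and
# LLM_theta a probability distribution over T*, the value function moves
# by at most the sup distance between the two reward models.
rng = np.random.default_rng(1)
llm = rng.dirichlet(np.ones(6))       # LLM_theta(x), sums to 1
rm = rng.dirichlet(np.ones(6))
rm_prime = rng.dirichlet(np.ones(6))

eps = float(np.max(np.abs(rm - rm_prime)))          # max_x |rm(x) - rm'(x)|
gap = abs(float(llm @ rm) - float(llm @ rm_prime))  # |v(theta;rm) - v(theta;rm')|
assert gap <= eps + 1e-12
```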

See 4.8

Proof.

Let $\hat{\theta}=\psi(\overrightarrow{\widehat{\text{rm}}},\vec{w},\theta_{\text{init}})$ and $\theta=\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})$.

\begin{align}
ASW(\overrightarrow{\text{rm}},\vec{w},\hat{\theta};\theta_{\text{init}})&=\sum_{i=1}^{n}w_{i}v_{i}(\hat{\theta};\text{rm}_{i})-\lambda D(\hat{\theta}||\theta_{\text{init}})\\
&\overset{(1)}{\geq}\sum_{i=1}^{n}w_{i}\left(v_{i}(\hat{\theta};\widehat{\text{rm}}_{i})-\epsilon\right)-\lambda D(\hat{\theta}||\theta_{\text{init}})\\
&=ASW(\overrightarrow{\widehat{\text{rm}}},\vec{w},\hat{\theta};\theta_{\text{init}})-\sum_{i=1}^{n}w_{i}\epsilon\\
&\overset{(2)}{\geq}ASW(\overrightarrow{\widehat{\text{rm}}},\vec{w},\theta;\theta_{\text{init}})-\sum_{i=1}^{n}w_{i}\epsilon\\
&=\sum_{i=1}^{n}w_{i}v_{i}(\theta;\widehat{\text{rm}}_{i})-\lambda D(\theta||\theta_{\text{init}})-\sum_{i=1}^{n}w_{i}\epsilon\\
&\overset{(3)}{\geq}\sum_{i=1}^{n}w_{i}\left(v_{i}(\theta;\text{rm}_{i})-\epsilon\right)-\lambda D(\theta||\theta_{\text{init}})-\sum_{i=1}^{n}w_{i}\epsilon\\
&=ASW(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})-2\sum_{i=1}^{n}w_{i}\epsilon.
\end{align}

Here, (1) and (3) follow directly from Lemma B.2, and (2) holds by the definition of $\hat{\theta}$:

$$\hat{\theta}=\psi(\overrightarrow{\widehat{\text{rm}}},\vec{w},\theta_{\text{init}})=\arg\max_{\theta\in\Theta}ASW(\overrightarrow{\widehat{\text{rm}}},\vec{w},\theta;\theta_{\text{init}}).$$

∎
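The chain of inequalities can likewise be checked numerically. The sketch below again assumes (not from the paper) the KL-regularized SW-maximizing rule with its closed-form maximizer over a finite $T^{\ast}$; the perturbation scheme and all names are illustrative:

```python
import numpy as np

# Numeric sketch of ASW(rm, w, theta_hat) >= ASW(rm, w, theta) - 2*sum_i(w_i)*eps,
# using p(x) ∝ p_init(x) * exp(sum_i w_i rm_i(x) / lam) as the exact maximizer.
lam = 1.0
p_init = np.full(4, 1 / 4)
rng = np.random.default_rng(2)

def train(rms, ws):
    logits = np.log(p_init) + sum(w * r for w, r in zip(ws, rms)) / lam
    p = np.exp(logits - logits.max())
    return p / p.sum()

def asw(rms, ws, p):
    val = sum(w * float(r @ p) for w, r in zip(ws, rms))
    return val - lam * float(np.sum(p * np.log(p / p_init)))

ws = [1.0, 2.0]
rms = [rng.dirichlet(np.ones(4)) for _ in ws]

# Biased reports rm_hat: perturb each rm while keeping it a distribution.
rms_hat = []
for r in rms:
    z = rng.uniform(-1, 1, 4)
    z -= z.mean()                          # zero-sum noise keeps normalization
    z *= 0.5 * r.min() / np.abs(z).max()   # small enough to keep entries positive
    rms_hat.append(r + z)
eps = max(float(np.max(np.abs(r - rh))) for r, rh in zip(rms, rms_hat))

theta_hat = train(rms_hat, ws)   # model trained on the biased reports
theta = train(rms, ws)           # model trained on the true reward models
slack = asw(rms, ws, theta) - asw(rms, ws, theta_hat)
assert slack <= 2 * sum(ws) * eps + 1e-9
```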

See 4.9

Proof.

Recall that the payment in $p^{AFF}$ is calculated as

\begin{align}
p_{i}^{AFF}(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}})=\;&ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\psi(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}});\theta_{\text{init}})\\
-\;&ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\psi(\overrightarrow{\text{rm}},\vec{w},\theta_{\text{init}});\theta_{\text{init}}).
\end{align}

Let $\vec{w}=(w_{i},\vec{w}_{-i})$; the utility function can be written as:

\begin{align}
u_{i}((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w};\psi,p,\text{rm}_{i},w_{i})&=w_{i}v_{i}(\theta;\text{rm}_{i})-p_{i}^{AFF}((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w},\theta_{\text{init}})\\
&=w_{i}v_{i}(\theta;\text{rm}_{i})-ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta_{-i};\theta_{\text{init}})+ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})\\
&=ASW(\overrightarrow{\text{rm}},\vec{w},\theta;\theta_{\text{init}})-ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta_{-i};\theta_{\text{init}}),
\end{align}

where we define $\theta=\psi((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w},\theta_{\text{init}})$ and $\theta_{-i}=\psi(\overrightarrow{\text{rm}}_{-i},\vec{w}_{-i},\theta_{\text{init}})$. Note that the term $ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta_{-i};\theta_{\text{init}})$ is not influenced by changes of $\text{rm}_{i}$ or $w_{i}$.

Therefore, we can derive that:

\begin{align}
&U_{i}((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w};\psi,p,\text{rm}_{i},w_{i})+ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta_{-i};\theta_{\text{init}})\\
=\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}_{i})}\left[u_{i}((\widehat{\text{rm}}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w};\psi,p,\text{rm}_{i},w_{i})+ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta_{-i};\theta_{\text{init}})\right]\\
=\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}_{i})}\left[ASW(\overrightarrow{\text{rm}},\vec{w},\hat{\theta};\theta_{\text{init}})\right]\\
=\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}_{i})}\left[w_{i}v_{i}(\hat{\theta};\text{rm}_{i})+\sum_{j\neq i}w_{j}v_{j}(\hat{\theta};\text{rm}_{j})-\lambda D(\hat{\theta}||\theta_{\text{init}})\right]\\
\overset{(1)}{\geq}\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}_{i})}\left[w_{i}v_{i}(\hat{\theta};\widehat{\text{rm}}_{i})+\sum_{j\neq i}w_{j}v_{j}(\hat{\theta};\text{rm}_{j})-\lambda D(\hat{\theta}||\theta_{\text{init}})\right]-w_{i}\epsilon\\
\overset{(2)}{\geq}\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}_{i})}\left[w_{i}v_{i}(\theta;\widehat{\text{rm}}_{i})+\sum_{j\neq i}w_{j}v_{j}(\theta;\text{rm}_{j})-\lambda D(\theta||\theta_{\text{init}})\right]-w_{i}\epsilon\\
\overset{(3)}{\geq}\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}_{i})}\left[w_{i}v_{i}(\theta;\text{rm}_{i})+\sum_{j\neq i}w_{j}v_{j}(\theta;\text{rm}_{j})-\lambda D(\theta||\theta_{\text{init}})\right]-2w_{i}\epsilon\\
\overset{(4)}{=}\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}^{\prime}_{i})}\left[w_{i}v_{i}(\theta;\text{rm}_{i})+\sum_{j\neq i}w_{j}v_{j}(\theta;\text{rm}_{j})-\lambda D(\theta||\theta_{\text{init}})\right]-2w_{i}\epsilon\\
\overset{(5)}{\geq}\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}^{\prime}_{i})}\left[w_{i}v_{i}(\hat{\theta};\text{rm}_{i})+\sum_{j\neq i}w_{j}v_{j}(\hat{\theta};\text{rm}_{j})-\lambda D(\hat{\theta}||\theta_{\text{init}})\right]-2w_{i}\epsilon\\
=\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}^{\prime}_{i})}\left[ASW(\overrightarrow{\text{rm}},\vec{w},\hat{\theta};\theta_{\text{init}})\right]-2w_{i}\epsilon\\
=\;&\mathbb{E}_{\widehat{\text{rm}}_{i}\sim\mathcal{F}_{i}(\cdot|\text{rm}^{\prime}_{i})}\left[u_{i}((\widehat{\text{rm}}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w};\psi,p,\text{rm}_{i},w_{i})+ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta_{-i};\theta_{\text{init}})\right]-2w_{i}\epsilon\\
=\;&U_{i}((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w};\psi,p,\text{rm}_{i},w_{i})+ASW_{-i}(\overrightarrow{\text{rm}},\vec{w},\theta_{-i};\theta_{\text{init}})-2w_{i}\epsilon.
\end{align}

All the $\hat{\theta}$ in the above inequalities refer to the optimal parameter for input $((\widehat{\text{rm}}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w},\theta_{\text{init}})$, i.e., $\hat{\theta}=\psi((\widehat{\text{rm}}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w},\theta_{\text{init}})$. Specifically, (1) and (3) come from the bounded distance between $\text{rm}_{i}$ and $\widehat{\text{rm}}_{i}$ (Lemma B.2). (2) and (5) hold by the definitions $\hat{\theta}=\psi((\widehat{\text{rm}}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w},\theta_{\text{init}})=\arg\max_{\theta^{\prime}\in\Theta}ASW((\widehat{\text{rm}}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w},\theta^{\prime};\theta_{\text{init}})$ and $\theta=\psi((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w},\theta_{\text{init}})=\arg\max_{\theta^{\prime}\in\Theta}ASW((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i}),\vec{w},\theta^{\prime};\theta_{\text{init}})$. And (4) holds since the inner term does not depend on $\widehat{\text{rm}}_{i}$.

Therefore, we get

$$U_{i}((\text{rm}_{i},\overrightarrow{\text{rm}}_{-i});\psi,p,\text{rm}_{i})\geq U_{i}((\text{rm}^{\prime}_{i},\overrightarrow{\text{rm}}_{-i});\psi,p,\text{rm}_{i})-2w_{i}\epsilon.$$

∎